1. This international preliminary examination report has been prepared by this International Preliminary Examining Authority
□ This report is also accompanied by ANNEXES, i.e., sheets of the description, claims and/or drawings
which have been amended and are the basis for this report and/or sheets containing rectifications made
before this Authority (see Rule 70.16 and Section 607 of the Administrative Instructions under the PCT).

These annexes consist of a total of sheets. 



3. This report contains indications relating to the following items: 

I ☒ Basis of the report

II □ Priority 

III Non-establishment of opinion with regard to novelty, inventive step and industrial applicability

V ☒ Reasoned statement under Article 35(2) with regard to novelty, inventive step or industrial applicability;
citations and explanations supporting such statement

VII □ Certain defects in the international application

VIII □ Certain observations on the international application

Form PCT/IPEA/409 (cover sheet) (January 1994)



INTERNATIONAL PRELIMINARY EXAMINATION REPORT

International application No. PCT/JP97/03239

I. Basis of the report 

1. This report has been drawn on the basis of (substitute sheets which have been furnished to the receiving Office in
response to an invitation under Article 14 are referred to in this report as "originally filed" and are not annexed to
the report since they do not contain amendments.):

Description, pages: 1-116 as originally filed

Claims, No.: 1-4 as originally filed

Drawings, sheets: 1/11-11/11 as originally filed



2. The amendments have resulted in the cancellation of: 

□ the description, pages: 

□ the claims, Nos.: 

□ the drawings, sheets: 

3. □ This report has been established as if (some of) the amendments had not been made, since they have been
considered to go beyond the disclosure as filed (Rule 70.2(c)):



4. Additional observations, if necessary: 



III. Non-establishment of opinion with regard to novelty, inventive step and industrial applicability 

The questions whether the claimed invention appears to be novel, to involve an inventive step (to be non-obvious) 
or to be industrially applicable have not been examined in respect of: 



because

□ the said international application, or the said claims Nos. relate to the following subject matter which does
not require an international preliminary examination (specify):

□ the description, claims or drawings (indicate particular elements below) or said claims Nos. are so unclear 
that no meaningful opinion could be formed (specify): 

□ the claims, or said claims Nos. are so inadequately supported by the description that no meaningful opinion 
could be formed. 

☒ no international search report has been established for the said claims Nos. 1-4 (partially).
IV. Lack of unity of invention 

1. In response to the invitation to restrict or pay additional fees the applicant has:

□ restricted the claims. 

□ paid additional fees. 

□ paid additional fees under protest. 

□ neither restricted nor paid additional fees. 

2. ☒ This Authority found that the requirement of unity of invention is not complied with and chose, according to Rule
68.1, not to invite the applicant to restrict or pay additional fees.

3. This Authority considers that the requirement of unity of invention in accordance with Rules 13.1, 13.2 and 13.3 is 

□ complied with. 

☒ not complied with for the following reasons:

see separate sheet 

4. Consequently, the following parts of the international application were the subject of international preliminary
examination in establishing this report:

☒ the parts relating to claims Nos. 1-4 (partially)

V. Reasoned statement under Article 35(2) with regard to novelty, inventive step or industrial
applicability; citations and explanations supporting such statement



1. Statement

Novelty (N)                     Yes: Claims 1-4
                                No:  Claims

Inventive step (IS)             Yes: Claims
                                No:  Claims 1-4

Industrial applicability (IA)   Yes: Claims 1-4
                                No:  Claims


2. Citations and explanations 
see separate sheet

EXAMINATION REPORT - SEPARATE SHEET



The following documents (D) are mentioned for the first time in this opinion/report; 
the numbering will be adhered to in the rest of the procedure: 

D1... Gene, vol. 163, 1995, pages 193-196 (Yokoyama-Kobayashi et al.)
D2... Nucleic Acids Res., vol. 14, 1986, pages 4683-4690 (Von Heijne)
D3... Science, vol. 261, 1993, pages 600-603 (Tashiro et al.)



IV) Unity

1) The International Searching Authority made an objection concerning lack of unity
within the application as originally filed (Rule 13.1 - 13.3 PCT). The objection is
summarised below:

The following 9 inventions identified within originally filed claims 1-4 are not so
linked as to form a single general inventive concept:

Invention 1. DNAs relating to SEQ ID No 10 and 19 and protein relating to SEQ ID
No 1 (claims 1-4 partially).

Invention 2. DNAs relating to SEQ ID No 11 and 20 and protein relating to SEQ ID
No 2 (claims 1-4 partially).

Invention 3. DNAs relating to SEQ ID No 12 and 21 and protein relating to SEQ ID
No 3 (claims 1-4 partially).

Invention 4. DNAs relating to SEQ ID No 13 and 22 and protein relating to SEQ ID
No 4 (claims 1-4 partially).

Invention 5. DNAs relating to SEQ ID No 14 and 23 and protein relating to SEQ ID
No 5 (claims 1-4 partially).

Invention 6. DNAs relating to SEQ ID No 15 and 24 and protein relating to SEQ ID
No 6 (claims 1-4 partially).

Invention 7. DNAs relating to SEQ ID No 16 and 25 and protein relating to SEQ ID
No 7 (claims 1-4 partially).

Invention 8. DNAs relating to SEQ ID No 17 and 26 and protein relating to SEQ ID
No 8 (claims 1-4 partially).

Invention 9. DNAs relating to SEQ ID No 18 and 27 and protein relating to SEQ ID
No 9 (claims 1-4 partially).

3) The only common concept linking the above inventions is the presence of a 
secretory signal sequence. However, since secretory signal sequences and 
methods for detecting them were disclosed in the prior art (D1-D3), this concept is 
not novel. 

4) In response to an invitation by the ISA, the applicant paid no additional search 
fees. Consequently, substantive examination has been carried out on invention 1,
which has been searched and is considered to be the main invention, being that 
which is first mentioned in the claims. 

V) Reasoned statement 

Inventive Step 

1) The present application does not satisfy the criterion set forth in Article 33(3)
PCT because the subject-matter of claims 1-4 does not involve an inventive step 
(Rule 65.1 and 65.2 PCT). 

2) Although the application discloses nucleotide sequences corresponding to a cDNA 
clone isolated from a human cDNA library and the "best guess" open reading 
frame (ORF) identifiable with it, it provides no evidence of the in vitro translated 
polypeptide's biological role - the description only states on page 22 what it is not 
(i.e. RANTES). Consequently, the invention of the present application is merely 



In this case any prior art compound, regardless of its technical properties, is
equally suitable as the starting point for making structural modifications and may
be considered to represent "the closest prior art". 

Starting from this point, the only technical problem which may be derived is how to 
provide a different compound.

Without the concomitant need to provide any particular technical effect, the skilled 
person would have the choice of an infinite number of equally obvious possible 
solutions. An arbitrary selection from this list cannot involve an inventive step 
because, in order to be patentable, a selection must be justified by a technical 
purpose, i.e. by a hitherto unknown or unexpected technical effect resulting from 
those structural features which distinguish the compound claimed from all the 
other possibilities. 

Thus, for nucleotide and peptide sequences whose function is based purely upon 
surmise, inventive step cannot be acknowledged. 

Furthermore, if an invention should provide a solution to a problem with reference
to the background art (Rule 5.1(a)(iii) PCT), the "invention" of the present
application is insufficiently disclosed (Article 5 PCT) and unclear (Article 6 PCT),
since it is left to the reader to perform the invention and determine what problem,
if any, the isolated nucleotide or polypeptide sequences actually solve.

INTERNATIONAL SEARCH REPORT

(PCT Article 18 and Rules 43 and 44)



Applicant's or agent's file reference: 660492

FOR FURTHER ACTION: see Notification of Transmittal of International Search Report
(Form PCT/ISA/220) as well as, where applicable, item 5 below

International application No.: PCT/JP 97/03239

International filing date (day/month/year): 12/09/1997

(Earliest) Priority Date (day/month/year): 13/09/1996

Applicant: SAGAMI CHEMICAL RESEARCH CENTER et al.



This International Search Report has been prepared by this International Searching Authority and is transmitted to the applicant
according to Article 18. A copy is being transmitted to the International Bureau.

This International Search Report consists of a total of



□ It is also accompanied by a copy of each prior art document cited in this report.

1. Certain claims were found unsearchable (see Box I).

2. ☒ Unity of invention is lacking (see Box II).

3. ☒ The international application contains disclosure of a nucleotide and/or amino acid sequence listing and the
international search was carried out on the basis of the sequence listing

□ filed with the international application.

furnished by the applicant separately from the international application,

□ but not accompanied by a statement to the effect that it did not include
matter going beyond the disclosure in the international application as filed.

□ transcribed by this Authority.

4. With regard to the title, ☒ the text is approved as submitted by the applicant.

□ the text has been established by this Authority to read as follows:

5. With regard to the abstract,

☒ the text is approved as submitted by the applicant.

□ the text has been established, according to Rule 38.2(b), by this Authority as it appears in
Box III. The applicant may, within one month from the date of mailing of this International
Search Report, submit comments to this Authority.

because the applicant failed to suggest a figure
because this figure better characterizes the invention

Box I Observations where certain claims were found unsearchable (Continuation of item 1 of first sheet)



This International Search Report has not been established in respect of certain claims under Article 17(2)(a) for the following reasons:

1. Claims Nos.:
because they relate to subject matter not required to be searched by this Authority, namely:

2. Claims Nos.:
because they relate to parts of the International Application that do not comply with the prescribed requirements to such
an extent that no meaningful International Search can be carried out, specifically:

3. Claims Nos.:
because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a).



Box II Observations where unity of invention is lacking (Continuation of item 2 of first sheet) 



This International Searching Authority found multiple inventions in this international application, as follows:

See annex 



As all required additional search fees were timely paid by the applicant, this International Search Report covers all
searchable claims. 



As all searchable claims could be searched without effort justifying an additional fee, this Authority did not invite payment 
of any additional fee. 



As only some of the required additional search fees were timely paid by the applicant, this International Search Report 
covers only those claims for which fees were paid, specifically claims Nos. 



No required additional search fees were timely paid by the applicant. Consequently, this International Search Report is 
restricted to the invention first mentioned in the claims, it is covered by claims Nos.: 



Remark on Protest 



The additional search fees were accompanied by the applicant's protest.

FURTHER INFORMATION CONTINUED FROM PCT/ISA/210



1. Claims: Claims 1-4 in part 

DNAs relating to SEQ ID No 10 and 19 and protein relating to 
SEQ ID No 1 

2. Claims: Claims 1-4 in part 

DNAs relating to SEQ ID No 11 and 20 and protein relating to 
SEQ ID No 2 

3. Claims: Claims 1-4 in part 

DNAs relating to SEQ ID No 12 and 21 and protein relating to 
SEQ ID No 3 

4. Claims: Claims 1-4 in part 

DNAs relating to SEQ ID No 13 and 22 and protein relating to 
SEQ ID No 4 

5. Claims: Claims 1-4 in part 

DNAs relating to SEQ ID No 14 and 23 and protein relating to 
SEQ ID No 5 

6. Claims: Claims 1-4 in part 

DNAs relating to SEQ ID No 15 and 24 and protein relating to 
SEQ ID No 6 

7. Claims: Claims 1-4 in part 

DNAs relating to SEQ ID No 16 and 25 and protein relating to 
SEQ ID No 7 

8. Claims: Claims 1-4 in part

DNAs relating to SEQ ID No 17 and 26 and protein relating to
SEQ ID No 8

9. Claims: Claims 1-4 in part

DNAs relating to SEQ ID No 18 and 27 and protein relating to
SEQ ID No 9



INTERNATIONAL SEARCH REPORT

International Application No PCT/JP 97/03239


A. CLASSIFICATION OF SUBJECT MATTER 

IPC 6 C12N15/12 C07K14/47 C12N15/62 






According to International Patent Classification (IPC) or to both national classification and IPC 






B. FIELDS SEARCHED 


Minimum documentation searched (classification system followed by classification symbols) 

IPC 6 C12N C07K 


Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched 


Electronic data base consulted during the international search (name of data base and, where practical, search terms used) 


C. DOCUMENTS CONSIDERED TO BE RELEVANT 


Category: Y
Citation of document, with indication, where appropriate, of the relevant passages:
YOKOYAMA-KOBAYASHI M ET AL.: "A signal sequence detection system using secreted
protease activity as an indicator", GENE, vol. 163, 1995, AMSTERDAM NL,
pages 193-196, XP002053953; cited in the application; see the whole document
Relevant to claim No.: 1-4

Category: Y
Citation of document, with indication, where appropriate, of the relevant passages:
VON HEIJNE G: "A new method for predicting signal sequence cleavage sites",
NUCLEIC ACIDS RESEARCH, vol. 14, 1986, OXFORD GB, pages 4683-4690, XP002053954;
cited in the application; see the whole document

☒ Further documents are listed in the continuation of box C.
☒ Patent family members are listed in annex.

° Special categories of cited documents:

"A" document defining the general state of the art which is not considered to be of particular relevance
"E" earlier document but published on or after the international filing date
"L" document which may throw doubts on priority claim(s) or which is cited to establish the publication date of another
citation or other special reason (as specified)
"O" document referring to an oral disclosure, use, exhibition or other means
"T" later document published after the international filing date or priority date and not in conflict with the application but
cited to understand the principle or theory underlying the invention
"X" document of particular relevance; the claimed invention cannot be considered novel or cannot be considered to
involve an inventive step when the document is taken alone
"Y" document of particular relevance; the claimed invention cannot be considered to involve an inventive step when the
document is combined with one or more other such documents, such combination being obvious to a person skilled
in the art

Citation of document, with indication, where appropriate, of the relevant passages:
TASHIRO K ET AL: "SIGNAL SEQUENCE TRAP: A CLONING STRATEGY FOR SECRETED PROTEINS AND
TYPE I MEMBRANE PROTEINS", SCIENCE, vol. 261, 30 July 1993, pages 600-603, XP000673204;
see the whole document
Relevant to claim No.: 1-4
J (US); LEDBETTER STEVEN R (US)) 9 February 1995
see page 51; claim 13

(54) Title: HUMAN PROTEINS HAVING SECRETORY SIGNAL SEQUENCES AND DNAs ENCODING THESE PROTEINS
(57) Abstract 

[Problems to be solved] To provide human proteins having secretory signal sequences and cDNAs encoding said proteins. [Means 
to solve the problems] Proteins containing any of the amino acid sequences represented by Sequence No. 1 to Sequence No. 9 and DNAs 
encoding said proteins exemplified by cDNAs containing any of the base sequences represented by Sequence No. 10 to Sequence No. 18. 
Said proteins can be provided by expressing cDNAs encoding human proteins having secretory signal sequences with verified secretory 
functions and recombinants of these human cDNAs.

DESCRIPTION



Human Proteins Having Secretory 
Signal Sequences and DNAs Encoding These Proteins 

TECHNICAL FIELD 

The present invention relates to human proteins having
secretory signal sequences and DNAs encoding these proteins.
The proteins of the present invention can be used as
pharmaceuticals or as antigens for preparing antibodies
against said proteins. The cDNAs of the present invention can
be used as probes for gene diagnosis and as gene sources for
gene therapy. Furthermore, the cDNAs can be used as gene
sources for large-scale production of the proteins encoded by
said cDNAs.

BACKGROUND ART 

Cells secrete many proteins. These secretory proteins play
important roles in the control of proliferation, the induction
of differentiation, material transport, biological defense,
etc. in cells. Unlike intracellular proteins, secretory
proteins exert their actions outside the cells, so they can be
administered into the body by injection or drip and are
therefore anticipated to have potential as



thrombolytic agents, etc. are currently used as
medicines. In addition, other secretory proteins are
undergoing clinical trials for development as
pharmaceuticals. Since human cells are thought to produce
many still-unknown secretory proteins, the availability of
these secretory proteins and of the genes encoding them is
expected to lead to the development of novel pharmaceuticals
using these proteins.

Heretofore, such a secretory protein has been obtained by
a method comprising isolation and purification of the
target protein from a large amount of blood or cell
culture supernatant using its biological activity as an
indicator, determination of its primary structure, cloning
of the corresponding cDNA on the basis of the amino acid
sequence thus obtained, and production of the recombinant
protein using said cDNA.
However, the contents of secretory proteins are
generally so low that isolation and purification are
difficult in many cases. On the other hand, secretory
proteins and type-I membrane proteins possess hydrophobic
sequences, defined as secretory signal sequences,
consisting of about 20 amino acid residues at the amino
termini (the N-termini). Therefore, genes encoding secretory
proteins or type-I membrane proteins are expected to be
clonable by using the presence or absence of these
secretory signal sequences as an indicator.



human proteins having secretory signal sequences and DNAs
encoding said proteins.

As the result of intensive studies, the present inventors
succeeded in cloning cDNAs having secretory signal
sequences from a human full-length cDNA bank, thereby
completing the present invention. That is to say, the present
invention provides proteins containing any of the amino acid
sequences represented by Sequence No. 1 to Sequence No. 9
that are human proteins having secretory signal sequences.
The present invention also provides DNAs encoding said
proteins, exemplified by cDNAs containing any of the base
sequences represented by Sequence No. 10 to Sequence No. 18.

Each of the proteins of the present invention can be
obtained, for example, by isolation from human
organs, cell lines, etc., by chemical synthesis of the
peptide on the basis of the amino acid sequence of the
present invention, or by production with recombinant DNA
technology using the DNA encoding the human secretory
protein of the present invention; production by
recombinant DNA technology is preferred. For
example, in vitro expression can be achieved by
preparing an RNA by in vitro transcription from a
vector having a cDNA of the present invention, followed by
in vitro translation using this RNA as a template. Also,
recombination of the translation region into a suitable
expression vector by a method known in the art leads to the



and so on.

In the case in which a protein of the present invention
is expressed in a microorganism such as Escherichia coli, the
translation region of a cDNA of the present invention is
inserted into an expression vector that can be replicated in
the microorganism and that has an origin, a promoter,
ribosome-binding site(s), cDNA-cloning site(s), a
terminator, etc. After transformation of the host cells with
said expression vector, the transformant thus obtained is
cultured, whereby the protein encoded by said cDNA can be
produced on a large scale in the microorganism. In that case,
a mature protein can be obtained by inserting an initiation
codon into the translation region from which the secretory
signal sequence has been removed and expressing the result.
Alternatively, a fusion protein with another protein
can be expressed. The protein portion encoded by said cDNA
alone can then be obtained by cleavage of said fusion protein
with an appropriate protease.
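The construction described above (removing the signal-sequence portion of the translation region and supplying a new initiation codon) can be sketched as follows. This is only an illustration under stated assumptions: the function name, its parameters and the toy sequence are hypothetical and do not come from the application, and the length of the signal sequence is taken as already known.

```python
def mature_coding_sequence(cds: str, signal_peptide_residues: int) -> str:
    """Return a coding sequence for the protein without its signal sequence.

    cds: full translation region, from the initiation codon through the
        stop codon (hypothetical input, not a sequence from the application).
    signal_peptide_residues: number of N-terminal residues, including the
        initiator Met, that form the signal sequence (assumed to be known).
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if not cds.startswith("ATG"):
        raise ValueError("coding sequence must begin with ATG")
    cut = 3 * signal_peptide_residues
    # At least one codon must remain between the cut point and the stop codon.
    if not 0 < cut < len(cds) - 3:
        raise ValueError("signal sequence length out of range")
    # Drop the signal-sequence codons and prepend a new initiation codon.
    return "ATG" + cds[cut:]


# Toy example: Met-Leu-Leu (signal) followed by Ala-Glu and a stop codon.
toy_cds = "ATGCTGCTGGCTGAATAA"
print(mature_coding_sequence(toy_cds, 3))
```

The product expressed from such a construct carries an extra N-terminal Met relative to the naturally processed protein, which is one reason the fusion-protein route with protease cleavage is described as an alternative.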

In the case in which a protein of the present invention
is expressed for secretion in animal cells, the protein can
be produced as a mature protein secreted outside the cells
when the translation region of said cDNA is recombined into
an expression vector for animal cells that has a promoter
for the animal cells, a splicing region, a poly(A) addition
site, etc., followed by transfection into the animal cells.

The proteins of the present invention include peptide 



represented by Sequence No. 1 to Sequence No. 9. These
fragments can be used as antigens for preparation of
antibodies. Also, the proteins of the present invention are
secreted outside the cells in the form of mature proteins
after the signal sequences are removed. Therefore,
these mature proteins shall come within the scope of the
present invention. The N-terminal amino acid sequences of the
mature proteins can be easily identified by using the
method for cleavage-site determination in a signal
sequence [Japanese Patent Kokai Publication No. 1996-187100].
Furthermore, many secretory proteins undergo processing
after secretion to be converted to their active
forms. These activated proteins or peptides shall come within
the scope of the present invention. When glycosylation sites
are present in the amino acid sequences, expression in
appropriate animal cells affords glycosylated proteins.
Therefore, these glycosylated proteins or peptides also shall
come within the scope of the present invention.

The DNAs of the present invention include all DNAs
encoding the above-mentioned proteins. Said DNAs can be
obtained by chemical synthesis, by cDNA cloning, and so on.

Each of the cDNAs of the present invention can be cloned
from, for example, a cDNA library of human cell origin.
The cDNA is synthesized using as a template a poly(A)+ RNA
extracted from human cells. The human cells may be cells
delivered from the human body, for example, by the operation

[Okayama, H. and Berg, P., Mol. Cell. Biol. 2: 161-170
(1982)], the Gubler-Hoffman method [Gubler, U. and Hoffman,
J., Gene 25: 263-269 (1983)], and so on, but it is preferred
to use the capping method [Kato, S. et al., Gene 150: 243-250
(1994)] as illustrated in Examples in order to obtain a
full-length clone in an effective manner.

The primary selection of a cDNA encoding a human protein
having a secretory signal sequence is performed by
sequencing a partial base sequence of a cDNA clone
selected at random from the cDNA library, deducing the
amino acid sequence encoded by the base sequence, and
checking for the presence or absence of hydrophobic site(s)
in the resulting N-terminal amino acid sequence region. Next,
the secondary selection is carried out by determining
the whole base sequence by sequencing and by protein
expression by in vitro translation. That the cDNA of the
present invention encodes a protein having a secretory
signal sequence is confirmed by using the signal sequence
detection method [Yokoyama-Kobayashi, M. et al., Gene 163:
193-196 (1995)]. In other words, that the coding portion of
the inserted cDNA fragment functions as a signal sequence is
confirmed by fusing a cDNA fragment encoding the N-terminus
of the target protein with a cDNA encoding the protease
domain of urokinase and then expressing the resulting cDNA in
COS7 cells to detect urokinase activity in the cell culture
medium.
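The check for hydrophobic site(s) in the N-terminal region used in the primary selection can be illustrated with a sliding-window hydropathy average. The following is a minimal sketch, not the procedure of the application: it assumes the Kyte-Doolittle hydropathy scale, and the window length, the length of the N-terminal region and the threshold are illustrative values chosen here. The application does not state which scale or parameters were used.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not named in the application).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=9):
    """Mean hydropathy of every window of the given length along the protein."""
    scores = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    if len(scores) < window:
        return []
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def has_hydrophobic_n_terminus(protein, region=30, window=9, threshold=1.6):
    """True if any window within the first `region` residues reaches the threshold.

    region, window and threshold are illustrative assumptions.
    """
    profile = hydropathy_profile(protein[:region], window)
    return any(value >= threshold for value in profile)


# Hypothetical toy sequences, not sequences from the application.
print(has_hydrophobic_n_terminus("MKLLLLLLLLLLAAAGSDEK"))
print(has_hydrophobic_n_terminus("MDEKRDEKRDEKRDEKRDEK"))
```

A profile of this kind only flags candidates. In the text above, the function of the signal sequence is confirmed experimentally through the urokinase fusion assay.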

The cDNAs of the present invention are characterized by 



represented by Sequence No. 19 to Sequence No. 27. Table 1
summarizes the clone number (HP number), the cells affording
the cDNA, the total base number of the cDNA, and the number
of the amino acid residues of the encoded protein, for each
of the cDNAs.

Table 1

Sequence Number   HP Number   Cells            Number of Bases   Number of Amino Acid Residues
1, 10, 19         HP00658     HT-1080          1296              154
2, 11, 20         HP00714     KB               331[illegible]    315
3, 12, 21         HP00876     Stomach cancer   1152              158
4, 13, 22         HP01134     Liver            1749              376
5, 14, 23         HP10029     KB               988               173
6, 15, 24         HP10189     KB               390               93
7, 16, 25         HP10269     U937             4667              1172
8, 17, 26         HP10298     Stomach cancer   1086              122
9, 18, 27         HP10368     Stomach cancer   866               175
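
The numbering pattern and the sizes in Table 1 can be checked with a few lines of arithmetic. The sketch below is an illustration only: the helper names are hypothetical, and the base count of the second clone is left as `None` because its last digit is not legible in the source. It relies on two observations: clone k is associated with Sequence Nos. k, k + 9 and k + 18, and a cDNA encoding n residues needs at least 3(n + 1) bases (n codons plus a stop codon).

```python
# Rows of Table 1: (protein Sequence No., HP number, cells, bases, residues).
# The base count of the second row is not legible in the source, hence None.
TABLE_1 = [
    (1, "HP00658", "HT-1080", 1296, 154),
    (2, "HP00714", "KB", None, 315),
    (3, "HP00876", "Stomach cancer", 1152, 158),
    (4, "HP01134", "Liver", 1749, 376),
    (5, "HP10029", "KB", 988, 173),
    (6, "HP10189", "KB", 390, 93),
    (7, "HP10269", "U937", 4667, 1172),
    (8, "HP10298", "Stomach cancer", 1086, 122),
    (9, "HP10368", "Stomach cancer", 866, 175),
]


def sequence_numbers(k):
    """Sequence Nos. listed in Table 1 for clone k (k = 1..9)."""
    return (k, k + 9, k + 18)


def min_orf_bases(residues):
    """Smallest cDNA length that can encode the residues plus a stop codon."""
    return 3 * (residues + 1)


def inconsistent_clones(rows):
    """HP numbers whose stated base count is too short for the stated protein."""
    return [
        hp
        for _k, hp, _cells, bases, residues in rows
        if bases is not None and bases < min_orf_bases(residues)
    ]


print(sequence_numbers(7))
print(inconsistent_clones(TABLE_1))
```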



The same clone as any of the cDNAs of the
present invention can be easily obtained by screening the
cDNA library constructed from the cell line or the human
tissue employed in the present invention, using an
oligonucleotide probe synthesized on the basis of the
corresponding cDNA base sequence depicted in Sequence No. 19
to Sequence No. 27.
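
Deriving an oligonucleotide probe from a known cDNA base sequence amounts to taking a segment of the sequence, or its reverse complement for the anti-sense chain. The following is a minimal sketch under stated assumptions: the function names, the default probe length of 20 bases and the toy sequence are hypothetical, and the application does not specify how probe positions or lengths were chosen.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def candidate_probe(cdna, start, length=20):
    """Sense and anti-sense probe candidates taken from position `start`.

    `start` is a 0-based index; `length` is an illustrative default.
    """
    segment = cdna.upper()[start:start + length]
    if len(segment) < length:
        raise ValueError("segment runs past the end of the cDNA sequence")
    gc = (segment.count("G") + segment.count("C")) / length
    return {
        "sense": segment,
        "antisense": reverse_complement(segment),
        "gc_fraction": gc,
    }


# Hypothetical toy sequence, not a sequence from the application.
print(candidate_probe("ATGCATGCATGGCCTTAAGG", 0, 8))
```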



WO 98/11217 PCT/JP97/03239

any cDNA that is subjected to insertion or deletion of one or
more nucleotides and/or substitution with other nucleotides
in Sequence No. 10 to Sequence No. 27 shall come within the
scope of the present invention.

In a similar manner, any protein that is produced by
these modifications comprising insertion or deletion of one
or more nucleotides and/or substitution with other
nucleotides shall come within the scope of the present
invention, as long as said protein possesses the activity of
the corresponding protein having the amino acid sequence
represented by Sequence No. 1 to Sequence No. 9.

The cDNAs of the present invention include cDNA fragments
(more than 10 bp) containing any partial base sequence of the
base sequence represented by Sequence No. 10 to No. 18 or of
the base sequence represented by Sequence No. 19 to No. 27.
For example, as illustrated in Examples, the portion encoding
the secretory signal sequence can be employed as a means to
secrete an optionally selected protein outside the cells by
fusing it with a cDNA encoding another protein. Also, DNA
fragments consisting of a sense chain and an anti-sense chain
shall come within this scope. These DNA fragments can be used
as probes for gene diagnosis.



BRIEF DESCRIPTION OF DRAWINGS 

Figure 1: A figure depicting the structure of the
secretory signal sequence detection vector pSSD3.

Figure 2: A figure depicting the construction of the

Figure 3: A figure depicting the
hydrophobicity/hydrophilicity profile of the protein encoded
by clone HP00685.

Figure 4: A figure depicting the
hydrophobicity/hydrophilicity profile of the protein encoded
by clone HP00714.

Figure 5: A figure depicting the
hydrophobicity/hydrophilicity profile of the protein encoded
by clone HP00876.

Figure 6: A figure depicting the
hydrophobicity/hydrophilicity profile of the protein encoded
by clone HP01134.

Figure 7: A figure depicting the
hydrophobicity/hydrophilicity profile of the protein encoded
by clone HP10029.

Figure 8: A figure depicting the
hydrophobicity/hydrophilicity profile of the protein encoded
by clone HP10189.

Figure 9: A figure depicting the
hydrophobicity/hydrophilicity profile of the protein encoded
by clone HP10269.

Figure 10: A figure depicting the
hydrophobicity/hydrophilicity profile of the protein encoded
by clone HP10298.

Figure 11: A figure depicting the
hydrophobicity/hydrophilicity profile of the protein encoded
by clone HP10368.



The present invention is embodied in more detail by the
following examples, but this embodiment is not intended to
restrict the present invention. The basic operations and the
enzyme reactions with regard to DNA recombination were
carried out according to the literature ["Molecular Cloning.
A Laboratory Manual", Cold Spring Harbor Laboratory, 1989].
Unless otherwise stated, the restriction enzymes and the
various modification enzymes used were those available from
Takara Shuzo Co., Ltd. The manufacturer's instructions were
followed for the buffer compositions as well as for the
reaction conditions in each of the enzyme reactions. The cDNA
synthesis was carried out according to the literature [Kato,
S. et al., Gene 150: 243-250 (1994)].

(1) Preparation of Poly(A)+ RNA

The fibrosarcoma cell line HT-1080 (ATCC CCL 121), the
epidermoid carcinoma cell line KB (ATCC CRL 17), the
histiocytic lymphoma cell line U937 (ATCC CRL 1593) stimulated
by phorbol esters, stomach cancer tissue obtained by
surgery, and liver were used as the human cells from which
mRNAs were extracted. Each of the cell lines was cultured by
a conventional procedure.

After about 1 g of human tissue was homogenized in 20 ml
of a 5.5 M guanidinium thiocyanate solution, total mRNAs were
prepared in accordance with the literature [Okayama, H. et
al., "Methods in Enzymology" Vol. 164, Academic Press, 1987].
These mRNAs were subjected to chromatography using an
oligo(dT)-cellulose column washed with 20 mM Tris-



above-mentioned literature.

(2) Construction of cDNA Library

To a solution of 10 µg of the above-mentioned poly(A)+ RNA
in 100 mM Tris-hydrochloric acid buffer solution (pH 8) was
added one unit of an RNase-free, bacterium-origin alkaline
phosphatase, and the resulting solution was allowed to react
at 37°C for one hour. After the reaction solution underwent
phenol extraction followed by ethanol precipitation, the
obtained pellet was dissolved in a mixed solution of 50 mM
sodium acetate (pH 6), 1 mM EDTA, 0.1% 2-mercaptoethanol,
and 0.01% Triton X-100. Thereto was added one unit of a
tobacco-origin pyrophosphatase (Epicenter Technologies), and
the resulting solution at a total volume of 100 µl was
allowed to react at 37°C for one hour. After the reaction
solution underwent phenol extraction followed by ethanol
precipitation, the thus-obtained pellet was dissolved in
water to obtain a decapped poly(A)+ RNA solution.

To a solution of the decapped poly(A)+ RNA and 3 nmol of
a DNA-RNA chimeric oligonucleotide (5'-dG-dG-dG-dG-dA-dA-dT-
dT-dC-dG-dA-G-G-A-3') in a mixed aqueous solution of 50 mM
Tris-hydrochloric acid buffer solution (pH 7.5), 0.5 mM ATP,
5 mM MgCl2, 10 mM 2-mercaptoethanol, and 25% polyethylene
glycol were added 50 units of T4 RNA ligase, and the resulting
solution at a total volume of 30 µl was allowed to react at
20°C for 12 hours. After the reaction solution underwent the

After the vector pKA1 developed by the present inventors
(Japanese Patent Kokai Publication No. 1992-117292) was
digested with KpnI, a tail of about 60 dT residues was added
using terminal transferase. This product was digested with
EcoRV to remove the dT tail at one side, and the resulting
molecule was used as a vectorial primer.

After 6 µg of the previously prepared chimeric oligo-
capped poly(A)+ RNA was annealed with 1.2 µg of the vectorial
primer, the product was dissolved in a mixed solution of 50
mM Tris-hydrochloric acid buffer solution (pH 8.3), 75 mM
KCl, 3 mM MgCl2, 10 mM dithiothreitol, and 1.25 mM dNTP (dATP
+ dCTP + dGTP + dTTP), mixed with 200 units of a reverse
transcriptase (GIBCO-BRL), and the resulting solution at a
total volume of 20 µl was allowed to react at 42°C for one
hour. After the reaction solution underwent phenol
extraction followed by ethanol precipitation, the thus-
obtained pellet was dissolved in a mixed solution of 50 mM
Tris-hydrochloric acid buffer solution (pH 7.5), 100 mM NaCl,
10 mM MgCl2, and 1 mM dithiothreitol. Thereto were added 100
units of EcoRI, and the resulting solution at a total volume
of 20 µl was allowed to react at 37°C for one hour. After the
reaction solution underwent phenol extraction followed by
ethanol precipitation, the obtained pellet was dissolved in
a mixed solution of 20 mM Tris-hydrochloric acid buffer
solution (pH 7.5), 100 mM KCl, 4 mM MgCl2, 10 mM

resulting solution was allowed to react at 16°C for 16 hours.
To the reaction solution were added 2 µl of 2 mM dNTP, 4
units of Escherichia coli DNA polymerase I, and 0.1 unit of
Escherichia coli RNase H, and the resulting solution was
allowed to react at 12°C for one hour and then at 22°C for
one hour.

Next, the cDNA-synthesis reaction solution was used to
transform Escherichia coli DH12S (GIBCO-BRL). The
transformation was carried out by the electroporation method.
A portion of the transformants was plated on a 2xYT agar
culture medium containing 100 µg/ml ampicillin, which was
incubated at 37°C overnight. Colonies grown on the culture
medium were picked at random and each was inoculated into 2
ml of the 2xYT culture medium containing 100 µg/ml
ampicillin, which was incubated at 37°C overnight. The
culture was centrifuged to collect the cells, from which
plasmid DNA was prepared by the alkaline lysis method. After
the plasmid DNA was double-digested with EcoRI and NotI, the
product was subjected to 0.8% agarose gel electrophoresis to
determine the size of the cDNA insert. In addition, using the
obtained plasmid as a template, a sequencing reaction with an
M13 universal primer labeled with a fluorescent dye and Taq
polymerase (a kit of Applied Biosystems Inc.) was carried
out, and the product was analyzed with a fluorescent DNA
sequencer (Applied Biosystems Inc.) to determine the base
sequence of about 400 bp at the 5'-terminal of the cDNA. The
sequence data were filed in the homo-protein cDNA bank data
base.



The base sequence registered in the homo-protein cDNA
bank was converted to amino acid sequences in three frames,
and the presence or absence of an open reading frame (ORF)
beginning from the initiation codon was examined. Then, the
selection was made for the presence of a signal sequence that
is characteristic of a secretory protein at the N-terminal of
the portion encoded by the ORF. These clones were sequenced
from both the 5' and 3' directions by using the deletion
method to determine the whole base sequence. The
hydrophobicity/hydrophilicity profiles were obtained for the
proteins encoded by the ORF by the Kyte-Doolittle method
[Kyte, J. & Doolittle, R. F., J. Mol. Biol. 157: 105-132
(1982)] to examine the presence or absence of a hydrophobic
region. In the case in which there was no hydrophobic region
of putative transmembrane domain(s) in the amino acid
sequence of an encoded protein, this protein was considered
to be a secretory protein or a membrane protein that did not
possess transmembrane domain(s).
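
As an illustration of the Kyte-Doolittle profile referred to above, the following Python sketch computes a sliding-window mean of the published hydropathy indices. The window length of 9 residues, the function names, and the toy sequence are assumptions made for illustration; the specification does not state the window that was used.

```python
# Kyte-Doolittle hydropathy indices (Kyte & Doolittle, 1982).
KD_INDEX = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=9):
    """Return the mean hydropathy of every window along the sequence.

    The window length is an assumption; 9 is a commonly used value.
    """
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    scores = [KD_INDEX[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def most_hydrophobic_window(seq, window=9):
    """Return (1-based start position, mean hydropathy) of the peak window."""
    profile = hydropathy_profile(seq, window)
    best = max(range(len(profile)), key=profile.__getitem__)
    return best + 1, profile[best]


if __name__ == "__main__":
    # Hypothetical toy sequence: a leucine stretch flanked by charged residues.
    toy = "KKKKLLLLLLLLLDDDD"
    print(most_hydrophobic_window(toy))
```

A pronounced positive peak near the N-terminal is what would be read as a candidate signal sequence; a further internal peak of sufficient length would instead suggest a transmembrane domain.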

(4) Construction of Secretory Signal Detection Vector pSSD3

One microgram of pSSD1 carrying the SV40 promoter and a
cDNA encoding the protease domain of urokinase [Yokoyama-
Kobayashi, M. et al., Gene 163: 193-196 (1995)] was digested
with 5 units of BglII and 5 units of EcoRV. Then, after
dephosphorylation at the 5' terminal by the CIP treatment, a
DNA fragment of about 4.2 kbp was purified by excision from
the gel after agarose gel electrophoresis.



phosphorylated by T4 polynucleotide kinase. After annealing
of both linkers, followed by ligation with the previously
prepared pSSD1 fragment by T4 DNA ligase, Escherichia coli
JM109 was transformed. A plasmid pSSD3 was prepared from the
transformant, and the objective recombinant was confirmed by
determining the base sequence of the linker-inserted
fragment. Figure 1 illustrates the structure of the thus-
obtained plasmid. The present plasmid vector carries three
types of blunt-end-forming restriction enzyme sites, SmaI,
PmaCI, and EcoRV. Since these cleavage sites are positioned
in succession at an interval of 7 bp, selecting the site that
suits the reading frame of the cDNA to be inserted, from
among the three possible frames, allows construction of a
vector expressing a fusion protein.
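
The reason a 7-bp spacing covers all three reading frames can be checked with a little arithmetic: offsets of 0, 7, and 14 bp leave remainders of 0, 1, and 2 when divided by 3. The sketch below is illustrative only; the order of the sites along the vector and the helper names are assumptions, not details taken from Figure 1.

```python
# Site names are from the text; their order along the vector is assumed
# here to be the order in which the text lists them.
SITES = ["SmaI", "PmaCI", "EcoRV"]
SITE_SPACING_BP = 7


def site_frames(spacing=SITE_SPACING_BP, n_sites=len(SITES)):
    """Return the reading-frame shift (0, 1, or 2) given by each site."""
    return [(i * spacing) % 3 for i in range(n_sites)]


def pick_site(required_shift):
    """Return the site whose offset gives the required frame shift."""
    wanted = required_shift % 3
    for name, frame in zip(SITES, site_frames()):
        if frame == wanted:
            return name
    raise ValueError("no site provides the requested frame")


if __name__ == "__main__":
    print(dict(zip(SITES, site_frames())))
```

Any spacing that is not a multiple of 3 would give the same coverage; a 6-bp or 9-bp spacing would leave all three sites in the same frame.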

(5) Functional Verification of Secretory Signal Sequence

Whether the N-terminal hydrophobic region in the
secretory protein clone candidate obtained in the above-
mentioned steps functions as the secretory signal sequence
was verified by the method described in the literature
[Yokoyama-Kobayashi, M. et al., Gene 163: 193-196 (1995)].
First, the plasmid containing the target cDNA was cleaved at
an appropriate restriction enzyme site located downstream of
the portion expected to encode the secretory signal
sequence. In the case in which this restriction enzyme site
left a protruding 5'-terminus, the site was blunt-ended by
the Klenow treatment. Digestion with HindIII was further
carried out and a DNA fragment containing



gel electrophoresis. This fragment was inserted between the
pSSD3 HindIII site and a restriction enzyme site selected so
as to match the urokinase-coding frame, thereby constructing
a vector expressing a fusion protein of the secretory signal
portion of the target cDNA and the urokinase protease domain
(refer to Figure 2).

After Escherichia coli (host: JM109) bearing the fusion-
protein expression vector was incubated at 37°C for 2 hours
in 2 ml of the 2xYT culture medium containing 100 µg/ml
ampicillin, the helper phage M13K07 (50 µl) was added and the
incubation was continued at 37°C overnight. A supernatant
separated by centrifugation underwent precipitation with
polyethylene glycol to obtain single-stranded phage
particles. These particles were suspended in 100 µl of 1 mM
Tris-0.1 mM EDTA, pH 8 (TE). As controls, suspensions of
single-stranded phage particles were prepared in the same
manner from pSSD3 and from the vector pKA1-UPA containing a
full-length cDNA of urokinase [Yokoyama-Kobayashi, M. et al.,
Gene 163: 193-196 (1995)].

The simian-kidney-origin cultured cells, COS7, were
incubated at 37°C in the presence of 5% CO2 in Dulbecco's
modified Eagle's culture medium (DMEM) containing 10% fetal
calf serum. Into a 6-well plate (Nunc Inc., 3 cm in the well
diameter) were inoculated 1 x 10^5 COS7 cells, and incubation
was carried out at 37°C for 22 hours in the presence of 5%
CO2. After the culture medium was removed, the

— t 1 "ii r f n v T n ',rn f lip^ *-.■>■-"-*->-». -) oli ornl-i n + o Vm i f f i- — r -. "1 ■> i +- n n n n H 

iiyul'uChlUi , , ci ; . f j : : . _ , ' KM KM , . ; . , ; . > A'e l aade^: 

1 µl of the single-stranded phage suspension, 0.6 ml of the
DMEM culture medium, and 3 µl of TRANSFECTAM(TM) (IBF Inc.),
and the resulting mixture was incubated at 37°C for 3 hours
in the presence of 5% CO2. After the sample solution was
removed, the cell surface was washed with DMEM, 2 ml per well
of DMEM containing 10% fetal calf serum was added, and the
incubation was carried out at 37°C for 2 days in the presence
of 5% CO2.

To 10 ml of 50 mM phosphate buffer solution (pH 7.4) 
containing 2% bovine fibrinogen (Miles Inc.), 0.5% agarose, 

3 r>H T m \A v^/*^+-nc«c -I it yy\ o K 1 /— \ -v -i /-l ^ t.t/~\ n ^-J /-\ y-J T A -. ^ -? -i~ ^ -T- V, , 

thrombin (Mochida Pharmaceutical Co., Ltd.) and the resulting 
mixture was solidified in a plate of 9 cm in diameter to 
prepare a fibrin plate. Ten microliters of the culture
supernatant of the transfected COS7 cells were spotted on the
fibrin plate, which was incubated at 37°C for 15 hours. The
diameter of the resulting clear circle was taken as an index
of the urokinase activity. Table 2 shows the restriction
enzyme site used for cutting the cDNA fragment out of each
clone, the restriction enzyme site used for cleavage of
pSSD3, and the presence or absence of a clear circle. Except
for pSSD3 used as the control, each of the samples formed a
clear circle, indicating that urokinase was secreted into the
culture medium. That is to say, it is indicated that each of
the cDNA fragments codes for the amino acid sequence that
functions as the secretory signal sequence.

Table 2

HP Number    Restriction Enzyme Site      Clear Circle
             cDNA*          Vector
HP00658      HindIII (K)    SmaI          +
HP00714      PvuII          PmaCI         +
HP00876      NcoI (K)       PmaCI         +
HP01134      PmaCI          (illegible)   +
HP10029      ApaI (K)       SmaI          +
HP10189      BglI (K)       PmaCI         +
HP10269      PvuII          PmaCI         +
HP10298      HindIII (K)    PmaCI         +
HP10368      EcoRV          PmaCI         +
pKA1-UPA                                  +
pSSD3                                     -

* (K) means that cleavage with the restriction enzyme is
followed by the Klenow treatment.



(6) Protein Synthesis by In Vitro Translation

The plasmid vector carrying the cDNA of the present
invention was utilized for in vitro transcription/translation
with the TNT rabbit reticulocyte lysate kit (Promega Biotec).
In this case, [35S]methionine was added and the expression
product was labeled with the radioisotope. All reactions were
carried out by following the protocols attached to the kit.
Two micrograms of the plasmid

va ^ R 1 1 DWOd fr> -r-p>F) <~ t ^ t- 7 0 ° ^ for- Q 0 rrn rm t o p i n total ? R ml of 



reticulocyte i y sa t o , . i> M . o i trie du i I a: suiuticn ( a l tacneu 
to the kit), 2 µl of an amino acid mixture (methionine-free),
2 µl (0.37 MBq/µl) of [35S]methionine (Amersham Corporation),
0.5 µl of T7 RNA polymerase, and 20 U of RNasin. Also, the
experiment in the presence of the membrane system was carried
out by adding 2.5 µl of the dog pancreatic microsome fraction
(Promega Biotec) to this reaction system. To 3 µl of the
reaction solution was added 2 µl of an SDS sampling buffer
(125 mM Tris-hydrochloric acid buffer solution, pH 6.8, 120
mM 2-mercaptoethanol, 2% SDS solution, 0.025% bromophenol
blue, and 20% glycerol), and the resulting solution was heated
at 95°C for 3 minutes and then subjected to SDS-
polyacrylamide gel electrophoresis. The molecular weight of
the translation product was determined by autoradiography.
Table 3 shows the molecular weight of the in vitro
translation product obtained from each of the clones in the
presence/absence of the microsomal membrane together with
the calculated molecular weight of the protein encoded by the
ORF of the cDNA.



WO 98/11217 



PCT/JP97/03239 




Table 3

Sequence  HP        Calculated  In Vitro Translation Product (kDa)
No.       Number    (Da)        Without Membrane  With Membrane
                                System Added      System Added*
1         HP00658    17,037     18                16
2         HP00714    37,106     47                -
3         HP00876    18,230     18                16
4         HP01134    42,947     42                49
5         HP10029    18,894     21                18
6         HP10189     9,113     12                -
7         HP10269   129,572     130               -
8         HP10298    13,161     16                -
9         HP10368    19,979     19                18

* - means "Not examined".



(7) Clone Examples

<HP00658> (Sequence Number 1, 10, 19)

Determination of the whole base sequence for the cDNA
insert of clone HP00658 obtained from the human fibrosarcoma
cell line HT-1080 cDNA libraries revealed the structure
consisting of a 5'-non-translation region of 55 bp, an ORF of
465 bp, and a 3'-non-translation region of 776 bp. The ORF
codes for a protein consisting of 154 amino acid residues
with a hydrophobic region of a putative secretory signal
sequence at the N-terminal. Figure 3 depicts the



data base using the amino acid sequence encoded by the ORF
revealed that the N-terminal 63 amino acid residues thereof
were completely identical with those in the RANTES protein
(EMBL Accession No. 21121) except for one amino acid residue
at position 7 (arginine in RANTES and alanine in the present
protein), but the sequences of the two proteins were
completely different after position 64. Moreover, RANTES
consists of 91 amino acid residues, whereas the present
protein is longer, consisting of 154 amino acid residues. The
in vitro translation resulted in the formation of a
translation product of 18 kDa, which was almost consistent
with the molecular weight of 17,037 predicted from the ORF.
In this case, the addition of the microsome resulted in the
formation of a 16-kDa product from which the secretory signal
sequence portion had putatively been removed by cleavage.
This result, together with the result on pSSD3, verifies that
the present protein possesses the secretory signal sequence.
Application of the (-3,-1) rule, a method for predicting the
signal sequence cleavage site [von Heijne, G., Nucl. Acids
Res. 14: 4683-4690 (1986)], suggests that the mature protein
starts from serine at position 24.

Comparison of the base sequences for the two proteins
revealed that the base sequence from position 2 to position
325 in the present cDNA was absent from the RANTES cDNA. It
is considered that this difference induced a frame shift to
form an ORF of a different size. Some mutations were observed
in other regions, wherein the



protein [Schall, T. J. et al., J. Immunol. 141: 1018-1025
(1988)], whereas the present cDNA was obtained from the
fibrosarcoma cells. Accordingly, the present protein is
considered to possess a different function from that of
RANTES.

Furthermore, the search of GenBank using the base
sequence of the present cDNA found no EST possessing a
homology of 90% or more.

<HP00714> (Sequence Number 2, 11, 20) 

Determination of the whole base sequence for the cDNA
insert of clone HP00714 obtained from the human epidermoid
carcinoma cell line KB cDNA libraries revealed the structure
consisting of a 5'-non-translation region of 56 bp, an ORF of
948 bp, and a 3'-non-translation region of 2310 bp. The ORF
codes for a protein consisting of 315 amino acid residues
with a hydrophobic region of a putative secretory signal
sequence at the N-terminal. Figure 4 depicts the
hydrophobicity/hydrophilicity profile of the present protein
obtained by the Kyte-Doolittle method. The in vitro
translation resulted in the formation of a translation
product of 47 kDa, which was somewhat larger than the
molecular weight of 37,106 predicted from the ORF. Since
human reticulocalbin, which is analogous to the present
protein, also gives a translation-product band on SDS-PAGE at
a molecular weight about 10 kDa larger than that expected
from its sequence [Ozawa, M., J. Biochem. 117: 1113-1119
(1995)], the molecular weight difference in the present
protein is considered to be



cleavage site, suggests that the mature protein starts from
lysine at position 20. There is a possibility that the
present protein exists in the endoplasmic reticulum, because
this protein possesses the C-terminal sequence HDEF analogous
to KDEL, the signal motif for localization in the
endoplasmic reticulum.
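
The kind of check that the (-3,-1) rule performs can be sketched as follows. This is a simplified reading of the rule as it is commonly summarized (a small residue at -1, no aromatic, charged, or large polar residue at -3, and no proline from -3 to +1); the cited method may score sites more finely, so the sketch only lists candidate sites and does not rank them. The search range and function name are assumptions, and the test sequence is the N-terminal stretch of the HP line as printed in Table 4.

```python
# Simplified (-3,-1) rule; residue sets follow the commonly quoted summary.
SMALL_AT_MINUS_1 = set("AGSCTQ")
FORBIDDEN_AT_MINUS_3 = set("FHYW") | set("DEKR") | set("NQ")


def candidate_cleavage_sites(seq, search_from=13, search_to=36):
    """Return 1-based positions of residues that could start the mature protein.

    search_from/search_to bound the position of the first mature residue;
    both defaults are assumptions chosen for illustration.
    """
    sites = []
    for start in range(search_from, min(search_to, len(seq)) + 1):
        i = start - 1                      # 0-based index of residue +1
        if i < 3:
            continue
        minus_1, minus_3 = seq[i - 1], seq[i - 3]
        if minus_1 not in SMALL_AT_MINUS_1:
            continue
        if minus_3 in FORBIDDEN_AT_MINUS_3:
            continue
        if "P" in seq[i - 3:i + 1]:        # positions -3 .. +1
            continue
        sites.append(start)
    return sites


if __name__ == "__main__":
    hp00714_n_term = "MDLRQFLMCLSLCTAFALSKPTEKKDR"
    print(candidate_cleavage_sites(hp00714_n_term))
```

Under these simplified criteria the lysine at position 20 stated in the text is among the candidates, since serine occupies position -1 and alanine position -3.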

The search of the protein data base using the amino acid
sequence of the present protein revealed that the protein was
analogous to human reticulocalbin (GenBank Accession No.
D42073). Table 4 indicates the comparison of the amino acid
sequences between the human protein of the present invention
(HP) and human reticulocalbin (RC). In the table, -
represents a gap, * represents an amino acid residue
identical to that in the protein of the present invention,
and . represents an amino acid residue analogous to that in
the protein of the present invention. The two proteins
possessed a homology of 60.5%.



Table 4 



HP ' MDLRQFLMCLSLCTAFALSKPTEKKDR-VHHEPQLSDKVHNDAQSFDYDH 

. #. * * ^ *** .*.**....* *. *** *** 

RC MARGGRGRRLGLALGLLLALVLAPRVLRAKPTVRKERVVRPDSELGERPPEDNQSFQYDH 
HP DAFLGAEEAKTFDQLTPEESKERLGKIVSKI DGDKDGFVTVDELKDWIKFAQKRWI YEDV 

RC EAFLGKEDSKTFDQLTPDESKERLGKI VDRI DNDGDGFVTTEELKTWI KRVQKRYI FDNV 
HP ERQWKGHDLNEDGLVSWEEYKNATYG YVLDDP DPDDGFN YKQMMVRDERRFKMADK 



UGDl i ATKLBr 'rAr'i.HPt:t YUYMKu i VVQBTMEU i DKNABGF i DLEBY I GDMYSHDGNTL
_ ***. **. ***********. **. *** **. ******. ***. * . ***. **. **. . *.

RC NGDLTATREEFTAFLHPEEFEHMKEI VVLETLEDI DKNGDGFVDQ.DEY I ADMFSHEENGP 
HP EPEWVKTEREQFVEFRDKNRDGKMDKEETKDW I LPSDYDHAEAEARHLV YESDQNKDGKL 



**. ** . ***** **** *. ***. **. *. . . **** *****. ***********. ***. ** 



RC EPDWVLSEREQFNEFRDLNKDGKLDKDE I RHWI LPQDYDHAQAEARHLV YESDKNKDEKL 



HP TKEEI VDKYDLFVGSQATDFGEALVR-HDEP 




RC TKEEI LENWNMFVGSQATNYGEDLTKNHDEL 
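
A percentage such as the one quoted for Table 4 can be obtained from a gapped alignment by counting identical columns. The sketch below is one plausible way to do this; the specification does not say which denominator was used (aligned columns, shorter sequence, or full length), so excluding gapped columns is an assumption, and the two short sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free alignment columns in which both residues match.

    Both inputs must be rows of the same alignment (equal length).
    Excluding gapped columns from the denominator is an assumption.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y) for x, y in zip(aligned_a, aligned_b)
        if x != gap and y != gap
    ]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Hypothetical six-column alignment with one gap and one mismatch.
    print(percent_identity("ACDE-G", "ACDFHG"))
```

Counting the columns marked * in the table and dividing by the number of compared positions is the manual equivalent of this calculation.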



Furthermore, the search of GenBank using the base
sequence of the present cDNA revealed that there existed some
ESTs possessing a homology of 90% or more and containing the
initiation codon (for example, Accession No. F3872), but none
of these sequences allowed prediction of the present protein.

Reticulocalbin is a protein localized on the membrane
surface of the endoplasmic reticulum and has been considered
to participate in protein folding. Accordingly, the protein
of the present invention is considered to be applicable to
the folding of recombinant proteins.

<HP00876> (Sequence Number 3, 12, 21) 

Determination of the whole base sequence for the cDNA
insert of clone HP00876 obtained from the human stomach
cancer cDNA libraries revealed the structure consisting of a
5'-non-translation region of 146 bp, an ORF of 477 bp, and a
3'-non-translation region of 529 bp. The ORF codes for a
protein



terminal. Figure 5 depicts the hydrophobicity/hydrophilicity
profile of the present protein obtained by the Kyte-Doolittle
method. The in vitro translation resulted in the formation of
a translation product of 18 kDa that was almost consistent
with the molecular weight of 18,230 predicted from the ORF.
In this case, the addition of the microsome resulted in the
formation of a 16-kDa product from which the secretory signal
sequence portion had putatively been removed by cleavage.
This result, together with the result on pSSD3, verifies that
the present protein possesses the secretory signal.
Application of the (-3,-1) rule, a method for predicting the
signal sequence cleavage site, suggests that the mature
protein starts from glycine at position 18 or aspartic acid
at position 23.
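
Whether the shift from 18 kDa to 16 kDa is of the size expected for signal-sequence removal can be estimated roughly. The sketch below assumes an average residue mass of 110 Da, a rule of thumb rather than a value from the specification, and uses the calculated precursor mass and the two candidate start positions given above; the function names are hypothetical.

```python
AVG_RESIDUE_MASS_DA = 110.0  # assumption: rule-of-thumb average residue mass


def signal_peptide_mass_kda(mature_start):
    """Approximate mass of residues 1 .. mature_start-1, in kDa."""
    return (mature_start - 1) * AVG_RESIDUE_MASS_DA / 1000.0


def expected_mature_mass_kda(precursor_da, mature_start):
    """Precursor mass minus the approximate signal-peptide mass, in kDa."""
    return precursor_da / 1000.0 - signal_peptide_mass_kda(mature_start)


if __name__ == "__main__":
    for start in (18, 23):
        mass = expected_mature_mass_kda(18230, start)
        print(f"mature protein from position {start}: about {mass:.1f} kDa")
```

Both candidate sites give an estimate close to 16 kDa, so the gel result is compatible with either and does not by itself distinguish between them.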

The search of the protein data base using the amino acid
sequence of the present protein revealed that the protein was
analogous to several C-type lectins. As an example, Table 5
indicates the comparison of the amino acid sequences between
the human protein of the present invention (HP) and the
rattlesnake lectin (CL) (Swiss-PROT Accession No. P21963). In
the table, - represents a gap, * represents an amino acid
residue identical to that in the protein of the present
invention, and . represents an amino acid residue analogous
to that in the protein of the present invention. The two
proteins possessed a homology of 35.3%.

Table 5



HP MASRSMRLLLLLSCLAKTGVLGD 1 1 MRPSCAPGWF YHKSNCYG YFRKLRNWSOAELECQ.S 

^* ^^ *^ 

CL NNCPLDWLPMNGLC YK I FNQLKTWEDAEMFCRK 

HP YGNGAHLASILSLKEASTIAEYISGYQRSQ.-PIWI GLHDPQKRQQWQW I DGAMYLYRSWS 
* * ****. * ****** * * **** * * * * * * * 

CL YKPGCHLASFHR YGESLEI AEY I SDYHKGQENVWI GLRDKKKDFSWEWTDRSCTDYLTWD 
HP GKSMGG— NKH-CAEMSSNNNFLTWSSNECNKRQHFLCKYRP 

«X» sL» vL> \L* kLt .i. . i . .»..<-. i > 

CL KNQPDHYQNKEPCVELVSLTGYRLWNDQVCESKDAFLCaCKF 



Furthermore, the search of GenBank using the base
sequence of the present cDNA found no EST possessing a
homology of 90% or more.

After 1 µg of the plasmid pHP00876 was digested with 20
units of PvuII, the product was subjected to 1% agarose gel
electrophoresis and a DNA fragment of about 700 bp was
excised from the gel. Next, 1 µg of pET-21a (Novagen) was
digested with 20 units of NheI, the product was subjected to
the Klenow treatment followed by 1% agarose gel
electrophoresis, and a DNA fragment of about 5.4 kbp was
excised from the gel. After ligation of the vector fragment
and the cDNA fragment using a ligation kit, Escherichia coli
BL21(DE3) (Novagen) was transformed. A plasmid pET876 was
prepared from the



vector expresses a protein in which methionine-alanine was
inserted before a protein starting from serine at position 29
in the protein encoded by the clone HP00876.

A suspension of pET876/BL21(DE3) in 5 ml of the LB
culture medium containing 100 µg/ml ampicillin was incubated
in a shaker at 37°C, and isopropyl thiogalactoside was added
to a concentration of 1 mM when A600 reached about 0.5. After
the incubation was continued at 37°C for 6 hours, cells were
collected by centrifugation and suspended in 25 ml of a
column buffer solution for the amylose column (10 mM Tris-
hydrochloric acid, pH 7.4, 200 mM NaCl, and 1 mM EDTA). The
resulting suspension was sonicated and then the insoluble
fraction was subjected to SDS-polyacrylamide gel
electrophoresis to identify a band originating from the
expression of the present vector at a position of about 14
kDa.

Since lectins recognize and then bind to sugar chains, 
lectins are useful as sugar-chain detection reagents and as 
affinity carriers for purification of glycoproteins. In 
addition, extracellular secretory lectins play important 
roles also in intercellular signal transduction and thereby 
are useful as medicines. 

<HP01134> (Sequence Number 4, 13, 22) 

Determination of the whole base sequence for the cDNA
insert of clone HP01134 obtained from the human liver cDNA
libraries revealed the structure consisting of a 5'-non-
translation region of 116 bp, an ORF of 1131 bp, and a 3'-
non-translation region of 502 bp. The ORF codes for a protein

terminal. Figure 6 depicts the hydrophobicity/hydrophilicity
profile of the present protein obtained by the Kyte-Doolittle
method. The in vitro translation resulted in the formation of
a translation product of 42 kDa that was almost consistent
with the molecular weight of 42,947 predicted from the ORF.
In this case, the addition of the microsome resulted in the
formation of a 49-kDa product to which a sugar chain had
putatively been added by N-glycosylation after the secretion.
The amino acid sequence of this protein contains four
possible N-glycosylation sites (Asn-Gly-Thr at position 91,
Asn-Glu-Thr at position 167, Asn-Thr-Ser at position 263, and
Asn-Lys-Thr at position 272). The above result, together with
the result on pSSD3, verifies that the present protein
possesses the secretory signal. Application of the (-3,-1)
rule, a method for predicting the signal sequence cleavage
site, suggests that the mature protein starts from alanine at
position 17 or valine at position 18.
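
The four sites listed above all fit the usual N-glycosylation consensus, Asn-X-Ser/Thr with X being any residue other than proline. A minimal sketch of how such sites can be located in a sequence is shown below; the function name and the toy sequence are hypothetical, and the full HP01134 sequence is not reproduced here.

```python
import re

# Asn, then any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping sequons be reported as well.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_n_glycosylation_sites(seq):
    """Return (1-based Asn position, tripeptide) for each consensus sequon."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical toy sequence: NGT and NKT match, NPS is excluded.
    print(find_n_glycosylation_sites("MNGTANPSNKT"))
```

A match only marks a possible site; whether it is actually glycosylated depends on the protein entering the secretory pathway, which is what the microsome experiment addresses.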

The search of the protein data base using the amino acid
sequence of the present protein revealed that the protein was
analogous to several cysteine proteinases. As an example,
Table 6 indicates the comparison of the amino acid sequences
between the human protein of the present invention (HP) and
the tangerine cysteine proteinase (CP) (GenBank Accession No.
Z47793). In the table, - represents a gap, * represents an
amino acid residue identical to that in the protein of the
present invention, and . represents an amino acid residue
analogous



region of 286 amino acid residues 




Table 6



HP MVWKVAVFLSVALGI GAVPI DDPEDGGKH 

^^^^ 

* * • • 

CP MTRLASGVL I TLLVALAG IADGSRDIAGDILKLPSEAYRFFHNGGGGAKVNDDDDSVGTR 

HP WVVI VAGSNGWYNYRHQADACHAYQI I HRNG I PDEQI VVMMY DD I AYSEDNPTPG I V I NR 
* * _ ***** _ ******* ***** * ** * * ****** _ * ** ** _ ** 

CP WAVLLAGSNGFWNYRHQADI CHAYQLLRKGGLKDENI I VFMYDD I AFNEENPRPGV I INH 

HP PNGTDVYQ.G VPKDYTGEDVTPQNFLAVLRGDAEAVKG I GSGKVLKSGPQDHVF I YFTDHG 
*, *_ *** ************ _ _ *_ **_ *_ * * *****_ _ ***_ ** ** _ _ 

CP PHGDDVYKG VPKDYTGEDVTVEKFFAVVLGNKTALTG-GSGKVVDSGPNDH I FI FYSDHG 
HP STG I LVFPNED-LHVKDLNET I HYMYKHKMYRKMVFY I EACESGSMMN-HLPDNI NVYAT 

..*.*.* * *. . . ***. *******. . . *...*. *** 

CP GPGVLGMPTSRYI YADELI DVLKKKHASGNYKSLVFYLEACESGS I FEGLLLEGLNI YAT 
HP TAANPRESSYACYY — -DEKRSTY — LGDWYSVNWMEDSDVEDLTKETLHKQYHLVKS 

**. *. ***. . *. .... * *** **. . ******. . . * ****. **. ***. 
CP TASNAEESSWGTYCPGEI PGPPPEYSTCLGDLYSI AWMEDSDI HNLRTETLHQQYELVKT 
HP HT NTSHVMQYGNKTISTMKVMQFQGMKRKASSPVPLPPVTHLDLTPSPDVPLTIM 

• ••••••» 

CP RTASYNSYGSHVMQYGDIGLSKNNLFTYLGTNPANDNYTFVDENSLRPASKAVNQRDADL 



Furthermore, the search of GenBank using the base 
sequence of the present cDNA revealed that there existed some 
ESTs possessing the homology of 90% or more (for example, 



identified.

Extracellular secretory proteases possess a variety of
physiological functions and are thereby useful as medicines.
In addition, proteases have been utilized as research
reagents, for example for the structural analysis of
proteins by limited degradation.

<HP10029> (Sequence Number 5, 14, 23) 

Determination of the whole base sequence for the cDNA
insert of clone HP10029 obtained from the human epidermoid
carcinoma cell line KB cDNA libraries revealed the structure
consisting of a 5'-non-translation region of 8 bp, an ORF of
522 bp, and a 3'-non-translation region of 458 bp. The ORF
codes for a protein consisting of 173 amino acid residues
with a hydrophobic region of a putative secretory signal
sequence at the N-terminal. Figure 7 depicts the
hydrophobicity/hydrophilicity profile of the present protein
obtained by the Kyte-Doolittle method. The in vitro
translation resulted in the formation of a translation
product of 21 kDa that was almost consistent with the
molecular weight of 18,894 predicted from the ORF. In this
case, the addition of the microsome resulted in the formation
of an 18-kDa product from which the secretory signal sequence
portion had putatively been removed by cleavage. This result,
together with the result on pSSD3, verifies that the present
protein possesses the secretory signal sequence. Application
of the (-3,-1) rule, a method for predicting the signal
sequence cleavage site, suggests that the mature



endoplasmic reticulum because this protein possesses the C-
terminal sequence RTEL analogous to KDEL, the signal motif
for localization in the endoplasmic reticulum.

The search of the protein data base using the amino acid
sequence of the present protein revealed that the protein was
not homologous with any known protein. In addition, the
search of GenBank using the base sequence revealed that there
existed some ESTs possessing a homology of 90% or more (for
example, Accession No. H87021), but they were shorter than
the present cDNA and none contained the initiation codon.
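
The ORF lengths and residue counts reported for the clones can be cross-checked against each other. The figures in the text are consistent with an ORF length that includes the stop codon, so the number of residues is the length divided by three, minus one; that reading is an inference from the numbers, not something the specification states. The function name is hypothetical.

```python
def residues_from_orf(orf_bp):
    """Number of encoded residues for an ORF whose length includes the stop codon."""
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    return orf_bp // 3 - 1


# (ORF length in bp, residues reported in the text)
REPORTED = {
    "HP00658": (465, 154),
    "HP00714": (948, 315),
    "HP10029": (522, 173),
}

if __name__ == "__main__":
    for clone, (orf_bp, reported) in REPORTED.items():
        computed = residues_from_orf(orf_bp)
        status = "consistent" if computed == reported else "check"
        print(f"{clone}: {orf_bp} bp -> {computed} residues ({status})")
```

The same check is a quick way to catch a mistranscribed ORF length or residue count when proofreading the clone descriptions.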

<HP10189> (Sequence Number 6, 15, 24) 

Determination of the whole base sequence for the cDNA
insert of clone HP10189 obtained from the human epidermoid
carcinoma cell line KB cDNA libraries revealed the structure
consisting of a 5'-non-translation region of 101 bp, an ORF
of 222 bp, and a 3'-non-translation region of 67 bp. The ORF
codes for a protein consisting of 73 amino acid residues with
a hydrophobic region of a putative secretory signal sequence
at the N-terminal. Figure 8 depicts the
hydrophobicity/hydrophilicity profile of the present protein
obtained by the Kyte-Doolittle method. The in vitro
translation resulted in the formation of a translation
product of 10 kDa that was almost consistent with the
molecular weight of 9,113 predicted from the ORF. Application
of the (-3,-1) rule, a method for predicting the signal
sequence cleavage site, suggests that the mature



sequence of the present protein revealed that the protein was
not homologous to any known protein. A search of GenBank
using the base sequence revealed some ESTs with homology of
90% or more that contained the initiation codon (for example,
Accession No. N56270), but a frame shift had occurred and the
same ORF as that in the present cDNA was not identified.

<HP10269> (Sequence Number 7, 16, 25)

Determination of the whole base sequence for the cDNA
insert of clone HP10269 obtained from the human lymphoma cell
line U937 cDNA libraries revealed the structure consisting of
a 5'-non-translation region of 753 bp, an ORF of 3519 bp, and
a 3'-non-translation region of 395 bp. The ORF codes for a
protein consisting of 1172 amino acid residues with a
hydrophobic region of a putative secretory signal sequence at
the N-terminal. Figure 9 depicts the
hydrophobicity/hydrophilicity profile of the present protein
obtained by the Kyte-Doolittle method. The in vitro
translation resulted in the formation of a translation
product of 130 kDa that was almost consistent with the
molecular weight of 129,571 predicted from the ORF.
Application of the (-3,-1) rule, a method for predicting the
signal sequence cleavage site, suggests that the mature
protein starts from glutamine at position 18.
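
As an illustration of how the (-3,-1) rule can be applied, the
following Python sketch scans a sequence for positions that satisfy
a simplified form of the rule: a small, neutral residue at -1, no
aromatic, charged or large polar residue at -3, and no proline from
-3 to +1. The residue sets, the search window, the function name and
the toy sequence are all assumptions made for this sketch; the toy
sequence is not the sequence of the present protein, and this is not
necessarily the procedure used to obtain the result stated above.

```python
# Simplified (-3,-1) rule; the residue sets below are assumptions of this sketch.
SMALL_AT_MINUS_1 = set("ASGCTQ")          # small, neutral residues allowed at -1
FORBIDDEN_AT_MINUS_3 = set("FHYWDEKRNQ")  # aromatic, charged, large polar


def candidate_cleavage_sites(seq, window=(15, 30)):
    """Return 1-based positions of the first mature residue (+1)
    for every site in the window that satisfies the simplified rule."""
    first, last = window
    sites = []
    for start in range(max(first, 4), min(last, len(seq)) + 1):
        minus_1 = seq[start - 2]            # residue at position -1
        minus_3 = seq[start - 4]            # residue at position -3
        around_site = seq[start - 4:start]  # residues -3, -2, -1, +1
        if (minus_1 in SMALL_AT_MINUS_1
                and minus_3 not in FORBIDDEN_AT_MINUS_3
                and "P" not in around_site):
            sites.append(start)
    return sites


# Hypothetical toy sequence, not the sequence of the present protein.
TOY = "MKW" + "L" * 12 + "VRAQDPEK"
print(candidate_cleavage_sites(TOY, window=(15, 20)))  # [19]
```

In the toy sequence the only admissible site places glutamine at
position 19 as the first residue of the mature form. A real
prediction would normally weigh several candidate sites rather than
accept the first one that passes.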

The search of the protein data base using the amino acid 
sequence of the present protein revealed that the protein was 
analogous to the B3 chain of laminin S. Table 7 indicates the 



human laminin S (B3) (GenBank Accession No. L25541)

Table 7

Amino Acid Residue Number    HP     B3
124                          Gln    Arg
269                          Pro    Deficient
388                          Pro    Ala
426                          Gln    Arg
427                          Gly    Arg
439                          Arg    Deficient
441                          Asp    Glu
603                          Arg    Pro
815                          Gly    Ala
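
A listing of the kind shown in Table 7 can be produced from two
aligned sequences. The Python sketch below is a minimal example under
stated assumptions: the two sequences are already aligned to equal
length, "-" marks a residue that is absent from one sequence
(reported as "Deficient"), and residue numbers follow the first
sequence. The function name and the short toy sequences are
hypothetical and are not taken from the present protein or from
laminin S.

```python
def list_differences(hp, b3, gap="-"):
    """Compare two pre-aligned sequences and return
    (residue number in hp, residue in hp, residue in b3) for each mismatch."""
    if len(hp) != len(b3):
        raise ValueError("sequences must be aligned to the same length")
    rows = []
    position = 0  # residue number counted along hp, ignoring gaps in hp
    for a, b in zip(hp, b3):
        if a != gap:
            position += 1
        if a != b:
            rows.append((position,
                         "Deficient" if a == gap else a,
                         "Deficient" if b == gap else b))
    return rows


# Hypothetical aligned toy sequences (one-letter codes).
for row in list_differences("MQPRG", "MR-RA"):
    print(row)
```

With these toy inputs the function reports differences at residues
2, 3 and 5, the second of which is a residue missing from the
comparison sequence.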



Comparison of the base sequence of the present cDNA and
the base sequence described in the database reveals that the
5'-terminus of the present cDNA is longer by 600 bp or more
and that the 81-bp 5'-terminus of the base sequence described
in the database is not consistent at all with the base
sequence of the present cDNA. Accordingly, the two proteins
originate from different mRNAs.

As an extracellular matrix, laminin participates deeply
in the proliferation and differentiation of cells.
Accordingly, laminin has been employed as an additive for
cell culture and the like.

<HP10298> (Sequence Number 8, 17, 26) 

Determination of the whole base sequence for the cDNA 



5'-non-translation region of 137 bp, an ORF of 369 bp, and a






3'-non-translation region of 580 bp. The ORF codes for a
protein consisting of 122 amino acid residues with a
hydrophobic region of a putative secretory signal sequence at
the N-terminal. Figure 10 depicts the
hydrophobicity/hydrophilicity profile of the present protein
obtained by the Kyte-Doolittle method. The in vitro
translation resulted in the formation of a translation
product of 16 kDa that was almost consistent with the
molecular weight of 13,161 predicted from the ORF.
Application of the (-3,-1) rule, a method for predicting the
signal sequence cleavage site, suggests that the mature
protein starts from leucine at position 18. It is also
possible that the present protein, which possesses a
hydrophobic C-terminal sequence of about 20 amino acid
residues, binds to the membrane via this portion.
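
The relationship between the ORF length and the number of amino acid
residues quoted for each clone can be checked with simple arithmetic.
The Python sketch below assumes that the stated ORF length includes
the stop codon, which agrees with the figures given for HP10189
(222 bp, 73 residues) and HP10298 (369 bp, 122 residues). The mass
estimate uses a rule-of-thumb average of 110 Da per residue, which is
an assumption of this sketch and is not the method by which the
molecular weights in the text were predicted.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average; an assumption


def residues_from_orf(orf_bp):
    """Number of encoded residues for an ORF whose length includes the stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3 - 1  # one codon is the stop codon


def rough_mass_kda(n_residues):
    """Rough protein mass in kDa from the residue count."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


for clone, orf_bp in (("HP10189", 222), ("HP10298", 369)):
    n = residues_from_orf(orf_bp)
    print(clone, n, "residues, roughly", round(rough_mass_kda(n), 1), "kDa")
```

The rough estimates are close to, but not identical with, the
predicted molecular weights of 9,113 and 13,161 quoted above, as
expected for a composition-independent approximation.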

A search of the protein database using the amino acid
sequence of the present protein revealed that the protein was
not homologous to any known protein. A search of GenBank
using the base sequence revealed some ESTs with homology of
90% or more that contained the initiation codon (for example,
Accession No. D78655), but many of the sequences were
ambiguous and the same ORF as that in the present cDNA was
not identified.

<HP10368> (Sequence Number 9, 18, 27) 

Determination of the whole base sequence for the cDNA 
insert of clone HP10368 obtained from the human stomach 



3'-non-translation region of 266 bp. The ORF codes for a
protein consisting of 175 amino acid residues with a
hydrophobic region of a putative secretory signal sequence at
the N-terminal. Figure 11 depicts the
hydrophobicity/hydrophilicity profile of the present protein
obtained by the Kyte-Doolittle method. The in vitro
translation resulted in the formation of a translation
product of 20 kDa that was almost consistent with the
molecular weight of 19,979 predicted from the ORF. In this
case, the addition of the microsome resulted in the formation
of a 19-kDa product in which the secretory signal sequence
portion was putatively removed by cleavage. This result
together with the result on pSSD3 verifies that the present
protein possesses the secretory signal. Application of the
(-3,-1) rule, a method for predicting the signal sequence
cleavage site, suggests that the mature protein starts from
leucine at position 19 or arginine at position 21. It is
possible that the present protein exists in the endoplasmic
reticulum because this protein possesses the C-terminal
sequence KTEL analogous to KDEL, the signal motif sequence
for localization in the endoplasmic reticulum.
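
The comparison of a C-terminal tetrapeptide with KDEL can be
expressed as a position-by-position match count. The Python sketch
below is only an illustration: the function name and the toy
sequences are hypothetical, and the number of matching positions
needed to call a sequence "analogous" is a judgment that the text
does not specify.

```python
REFERENCE_MOTIF = "KDEL"


def c_terminal_matches(seq, motif=REFERENCE_MOTIF):
    """Count positions at which the C-terminal residues of seq equal the motif."""
    if len(seq) < len(motif):
        raise ValueError("sequence is shorter than the motif")
    tail = seq[-len(motif):]
    return sum(1 for a, b in zip(tail, motif) if a == b)


# Hypothetical toy sequences ending in the tetrapeptides named in the text.
print(c_terminal_matches("MAAAKTEL"))  # 3 of 4 positions match KDEL
print(c_terminal_matches("MAAARTEL"))  # 2 of 4 positions match KDEL
```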

A search of the protein database using the amino acid
sequence of the present protein revealed that the protein was
not homologous to any known protein. A search of GenBank
using the base sequence revealed some ESTs with homology of
90% or more that contained the initiation codon (for example,
Accession No.




INDUSTRIAL APPLICATION 

The present invention provides human proteins having 
secretory signal sequences and cDNAs encoding said proteins. 
All of the proteins of the present invention are putative 
proteins controlling the proliferation and differentiation of 
the cells, because said proteins are secreted outside the 
cells and exist in the extracellular liquid or on the cell 
membrane surface. Therefore, the proteins of the present
invention can be used as pharmaceuticals or as antigens for
preparing antibodies against said proteins. Furthermore, said
DNAs can be used for the expression of large amounts of said
proteins.

In addition to the activities and uses described above,
the polynucleotides and proteins of the present invention may
exhibit one or more of the uses or biological activities
(including those associated with assays cited herein)
identified below. Uses or activities described for proteins
of the present invention may be provided by administration or
use of such proteins or by administration or use of
polynucleotides encoding such proteins (such as, for example,
in gene therapies or vectors suitable for introduction of
DNA).

Research Uses and Utilities 

The polynucleotides provided by the present invention can 
be used by the research community for various purposes. The 
polynucleotides can be used to express recombinant protein



preferentially expressed (either constitutively or at a
particular stage of tissue differentiation or development or
in disease states); as molecular weight markers on Southern
gels; as chromosome markers or tags (when labeled) to
identify chromosomes or to map related gene positions; to
compare with endogenous DNA sequences in patients to identify
potential genetic disorders; as probes to hybridize and thus
discover novel, related DNA sequences; as a source of
information to derive PCR primers for genetic fingerprinting;
as a probe to "subtract-out" known sequences in the process
of discovering other novel polynucleotides; for selecting and
making oligomers for attachment to a "gene chip" or other
support, including for examination of expression patterns; to
raise anti-protein antibodies using DNA immunization
techniques; and as an antigen to raise anti-DNA antibodies or
elicit another immune response. Where the polynucleotide
encodes a protein which binds or potentially binds to another
protein (such as, for example, in a receptor-ligand
interaction), the polynucleotide can also be used in
interaction trap assays (such as, for example, that described
in Gyuris et al., Cell 75:791-803 (1993)) to identify
polynucleotides encoding the other protein with which binding
occurs or to identify inhibitors of the binding interaction.

The proteins provided by the present invention can 
similarly be used in assays to determine biological activity,
including in a panel of multiple proteins for high-throughput 
screening; to raise antibodies or to elicit another immune 



protein (or its receptor) in biological fluids; as markers
for tissues in which the corresponding protein is
preferentially expressed (either constitutively or at a
particular stage of tissue differentiation or development or
in a disease state); and, of course, to isolate correlative
receptors or ligands. Where the protein binds or potentially
binds to another protein (such as, for example, in a
receptor-ligand interaction), the protein can be used to
identify the other protein with which binding occurs or to
identify inhibitors of the binding interaction. Proteins
involved in these binding interactions can also be used to
screen for peptide or small molecule inhibitors or agonists
of the binding interaction.

Any or all of these research utilities are capable of 
being developed into reagent grade or kit format for 
commercialization as research products. 

Methods for performing the uses listed above are well 
known to those skilled in the art. References disclosing 
such methods include without limitation "Molecular Cloning:
A Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory
Press, Sambrook, J., E.F. Fritsch and T. Maniatis eds., 1989,
and "Methods in Enzymology: Guide to Molecular Cloning
Techniques", Academic Press, Berger, S.L. and A.R. Kimmel
eds., 1987.

Nutritional Uses 

Polynucleotides and proteins of the present invention can 
also be used as nutritional sources or supplements. Such

source) and use as a source of carbohydrate. In such cases
the protein or polynucleotide of the invention can be added
to the feed of a particular organism or can be administered
as a separate solid or liquid preparation, such as in the
form of powder, pills, solutions, suspensions or capsules.
In the case of microorganisms, the protein or polynucleotide
of the invention can be added to the medium in or on which
the microorganism is cultured.

Cytokine and Cell Proliferation/Differentiation Activity

A protein of the present invention may exhibit cytokine, 
cell proliferation (either inducing or inhibiting) or cell
differentiation (either inducing or inhibiting) activity or
may induce production of other cytokines in certain cell
populations. Many protein factors discovered to date,
including all known cytokines, have exhibited activity in one
or more factor dependent cell proliferation assays, and hence
the assays serve as a convenient confirmation of cytokine
activity. The activity of a protein of the present invention
is evidenced by any one of a number of routine factor
dependent cell proliferation assays for cell lines including,
without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3,
MC9/G, M+ (preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2,
TF-1, Mo7e and CMK.

The activity of a protein of the invention may, among
other means, be measured by the following methods: 

Assays for T-cell or thymocyte proliferation include 
without limitation those described in: Current Protocols in 



Associates and Wiley-Interscience (Chapter 3, In Vitro assays
for Mouse Lymphocyte Function 3.1-3.19; Chapter 7,
Immunologic studies in Humans); Takai et al., J. Immunol.
137:3494-3500, 1986; Bertagnolli et al., J. Immunol.
145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology
133:327-341, 1991; Bertagnolli et al., J. Immunol.
149:3778-3783, 1992; Bowman et al., J. Immunol.
152:1756-1761, 1994.

Assays for cytokine production and/or proliferation of
spleen cells, lymph node cells or thymocytes include, without
limitation, those described in: Polyclonal T cell
stimulation, Kruisbeek, A.M. and Shevach, E.M. In Current
Protocols in Immunology. J.E.e.a. Coligan eds. Vol 1 pp.
3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and
Measurement of mouse and human Interferon γ, Schreiber, R.D.
In Current Protocols in Immunology. J.E.e.a. Coligan eds. Vol
1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994.

Assays for proliferation and differentiation of 
hematopoietic and lymphopoietic cells include, without 
limitation, those described in: Measurement of Human and 
Murine Interleukin 2 and Interleukin 4, Bottomly, K., Davis, 
L.S. and Lipsky, P.E. In Current Protocols in Immunology. 
J.E.e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and 
Sons, Toronto. 1991; deVries et al., J. Exp. Med.
173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988;
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A.
80:2931-2938, 1983; Measurement of mouse and human



Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci.
U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin
11 - Bennett, F., Giannotti, J., Clark, S.C. and Turner,
K.J. In Current Protocols in Immunology. J.E.e.a. Coligan eds.
Vol 1 pp. 6.15.1 John Wiley and Sons, Toronto. 1991;
Measurement of mouse and human Interleukin 9 - Ciarletta, A.,
Giannotti, J., Clark, S.C. and Turner, K.J. In Current
Protocols in Immunology. J.E.e.a. Coligan eds. Vol 1 pp.
6.13.1, John Wiley and Sons, Toronto. 1991.

Assays for T-cell clone responses to antigens (which will 
identify, among others, proteins that affect APC-T cell 
interactions as well as direct T-cell effects by measuring 
proliferation and cytokine production) include, without 
limitation, those described in: Current Protocols in 
Immunology, Ed by J. E. Coligan, A.M. Kruisbeek, D.H. 
Margulies, E.M. Shevach, W Strober, Pub. Greene Publishing 
Associates and Wiley-Interscience (Chapter 3, In Vitro assays
for Mouse Lymphocyte Function; Chapter 6, Cytokines and their
cellular receptors; Chapter 7, Immunologic studies in
Humans); Weinberger et al., Proc. Natl. Acad. Sci. USA
77:6091-6095, 1980; Weinberger et al., Eur. J. Immun.
11:405-411, 1981; Takai et al., J. Immunol. 137:3494-3500,
1986; Takai et al., J. Immunol. 140:508-512, 1988.

Immune Stimulating or Suppressing Activity

A protein of the present invention may also exhibit 
immune stimulating or immune suppressing activity, including 
without limitation the activities for which assays are 

severe combined immunodeficiency (SCID)), e.g., in regulating
(up or down) growth and proliferation of T and/or B
lymphocytes, as well as effecting the cytolytic activity of
NK cells and other cell populations. These immune
deficiencies may be genetic or be caused by viral (e.g., HIV)
as well as bacterial or fungal infections, or may result from
autoimmune disorders. More specifically, infectious diseases
caused by viral, bacterial, fungal or other infection may be
treatable using a protein of the present invention,
including infections by HIV, hepatitis viruses,
herpesviruses, mycobacteria, Leishmania spp., malaria spp.
and various fungal infections such as candidiasis. Of
course, in this regard, a protein of the present invention
may also be useful where a boost to the immune system
generally may be desirable, i.e., in the treatment of cancer.

Autoimmune disorders which may be treated using a protein
of the present invention include, for example, connective
tissue disease, multiple sclerosis, systemic lupus
erythematosus, rheumatoid arthritis, autoimmune pulmonary
inflammation, Guillain-Barre syndrome, autoimmune
thyroiditis, insulin dependent diabetes mellitus, myasthenia
gravis, graft-versus-host disease and autoimmune inflammatory
eye disease. Such a protein of the present invention may
also be useful in the treatment of allergic reactions and
conditions, such as asthma (particularly allergic asthma) or
other respiratory problems. Other conditions, in which
immune suppression is desired (including, for example, organ



Using the proteins of the invention it may also be
possible to regulate immune responses, in a number of ways.
Down regulation may be in the form of inhibiting or blocking
an immune response already in progress or may involve
preventing the induction of an immune response. The functions
of activated T cells may be inhibited by suppressing T cell
responses or by inducing specific tolerance in T cells, or
both. Immunosuppression of T cell responses is generally an
active, non-antigen-specific, process which requires
continuous exposure of the T cells to the suppressive agent.
Tolerance, which involves inducing non-responsiveness or 
anergy in T cells, is distinguishable from immunosuppression 
in that it is generally antigen-specific and persists after 
exposure to the tolerizing agent has ceased. Operationally, 
tolerance can be demonstrated by the lack of a T cell 
response upon reexposure to specific antigen in the absence 
of the tolerizing agent. 

Down regulating or preventing one or more antigen 
functions (including without limitation B lymphocyte antigen 
functions (such as, for example, B7)), e.g., preventing high
level lymphokine synthesis by activated T cells, will be
useful in situations of tissue, skin and organ
transplantation and in graft-versus-host disease (GVHD). For
example, blockage of T cell function should result in reduced 
tissue destruction in tissue transplantation. Typically, in 
tissue transplants, rejection of the transplant is initiated 
through its recognition as foreign by T cells, followed by an 

interaction of a B7 lymphocyte antigen with its natural 






ligand(s) on immune cells (such as a soluble, monomeric form 
of a peptide having B7-2 activity alone or in conjunction 
with a monomeric form of a peptide having an activity of 
another B lymphocyte antigen (e.g., B7-1, B7-3) or blocking 
antibody), prior to transplantation can lead to the binding 
of the molecule to the natural ligand(s) on the immune cells 
without transmitting the corresponding costimulatory signal.
Blocking B lymphocyte antigen function in this manner
prevents cytokine synthesis by immune cells, such as T cells,
and thus acts as an immunosuppressant. Moreover, the lack of
costimulation may also be sufficient to anergize the T cells, 
thereby inducing tolerance in a subject. Induction of 
long-term tolerance by B lymphocyte antigen-blocking reagents 
may avoid the necessity of repeated administration of these 
blocking reagents. To achieve sufficient immunosuppression 
or tolerance in a subject, it may also be necessary to block 
the function of a combination of B lymphocyte antigens. 

The efficacy of particular blocking reagents in 
preventing organ transplant rejection or GVHD can be assessed 
using animal models that are predictive of efficacy in 
humans. Examples of appropriate systems which can be used 
include allogeneic cardiac grafts in rats and xenogeneic 
pancreatic islet cell grafts in mice, both of which have been 
used to examine the immunosuppressive effects of CTLA4Ig
fusion proteins in vivo as described in Lenschow et al.,
Science 257:789-792 (1992) and Turka et al., Proc. Natl.






Press, New York, 1989, pp. 846-847) can be used to determine
the effect of blocking B lymphocyte antigen function in vivo
on the development of that disease.

Blocking antigen function may also be therapeutically
useful for treating autoimmune diseases. Many autoimmune
disorders are the result of inappropriate activation of T
cells that are reactive against self tissue and which promote
the production of cytokines and autoantibodies involved in
the pathology of the diseases. Preventing the activation of
autoreactive T cells may reduce or eliminate disease
symptoms. Administration of reagents which block
costimulation of T cells by disrupting receptor:ligand
interactions of B lymphocyte antigens can be used to inhibit
T cell activation and prevent production of autoantibodies or
T cell-derived cytokines which may be involved in the disease
process. Additionally, blocking reagents may induce
antigen-specific tolerance of autoreactive T cells which
could lead to long-term relief from the disease. The
efficacy of blocking reagents in preventing or alleviating
autoimmune disorders can be determined using a number of
well-characterized animal models of human autoimmune
diseases. Examples include murine experimental autoimmune
encephalitis, systemic lupus erythematosus in MRL/lpr/lpr mice
or NZB hybrid mice, murine autoimmune collagen arthritis,
diabetes mellitus in NOD mice and BB rats, and murine
experimental myasthenia gravis (see Paul ed., Fundamental
Immunology, Raven Press, New York, 1989, pp. 840-856).



immune responses, may also be useful in therapy.
Upregulation of immune responses may be in the form of
enhancing an existing immune response or eliciting an initial
immune response. For example, enhancing an immune response
through stimulating B lymphocyte antigen function may be
useful in cases of viral infection. In addition, systemic
viral diseases such as influenza, the common cold, and
encephalitis might be alleviated by the administration of
stimulatory forms of B lymphocyte antigens systemically.

Alternatively, anti-viral immune responses may be
enhanced in an infected patient by removing T cells from the
patient, costimulating the T cells in vitro with viral
antigen-pulsed APCs either expressing a peptide of the
present invention or together with a stimulatory form of a
soluble peptide of the present invention and reintroducing
the in vitro activated T cells into the patient. Another
method of enhancing anti-viral immune responses would be to
isolate infected cells from a patient, transfect them with a
nucleic acid encoding a protein of the present invention as
described herein such that the cells express all or a portion
of the protein on their surface, and reintroduce the
transfected cells into the patient. The infected cells would
now be capable of delivering a costimulatory signal to, and
thereby activate, T cells in vivo.

In another application, up regulation or enhancement of
antigen function (preferably B lymphocyte antigen function) 






neuroblastoma, carcinoma) transfected with a nucleic acid
encoding at least one peptide of the present invention can be
administered to a subject to overcome tumor-specific
tolerance in the subject. If desired, the tumor cell can be
transfected to express a combination of peptides. For
example, tumor cells obtained from a patient can be
transfected ex vivo with an expression vector directing the
expression of a peptide having B7-2-like activity alone, or
in conjunction with a peptide having B7-1-like activity
and/or B7-3-like activity. The transfected tumor cells are
returned to the patient to result in expression of the
peptides on the surface of the transfected cell.
Alternatively, gene therapy techniques can be used to target
a tumor cell for transfection in vivo.

The presence of the peptide of the present invention
having the activity of a B lymphocyte antigen(s) on the
surface of the tumor cell provides the necessary
costimulation signal to T cells to induce a T cell mediated
immune response against the transfected tumor cells. In
addition, tumor cells which lack MHC class I or MHC class II
molecules, or which fail to reexpress sufficient amounts of
MHC class I or MHC class II molecules, can be transfected
with nucleic acid encoding all or a portion (e.g., a
cytoplasmic-domain truncated portion) of an MHC class I α
chain protein and β2 microglobulin protein or an MHC class
II α chain protein and an MHC class II β chain protein to
thereby express MHC class I or MHC class II proteins on the



a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T
cell mediated immune response against the transfected tumor
cell. Optionally, a gene encoding an antisense construct
which blocks expression of an MHC class II associated
protein, such as the invariant chain, can also be
cotransfected with a DNA encoding a peptide having the
activity of a B lymphocyte antigen to promote presentation of
tumor associated antigens and induce tumor specific immunity.
Thus, the induction of a T cell mediated immune response in
a human subject may be sufficient to overcome tumor-specific
tolerance in the subject.

The activity of a protein of the invention may, among 
other means, be measured by the following methods:

Suitable assays for thymocyte or splenocyte cytotoxicity
include, without limitation, those described in: Current
Protocols in Immunology, Ed by J. E. Coligan, A.M. Kruisbeek,
D.H. Margulies, E.M. Shevach, W Strober, Pub. Greene
Publishing Associates and Wiley-Interscience (Chapter 3, In
Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter
7, Immunologic studies in Humans); Herrmann et al., Proc.
Natl. Acad. Sci. USA 78:2488-2492, 1981; Herrmann et al., J.
Immunol. 128:1968-1974, 1982; Handa et al., J. Immunol.
135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500,
1986; Takai et al., J. Immunol. 140:508-512, 1988; Herrmann
et al., Proc. Natl. Acad. Sci. USA 78:2488-2492, 1981;
Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et
al., J. Immunol. 135:1564-1572, 1985; Takai et al., J.



Bertagnolli et al., Cellular Immunology 133:327-341, 1991;
Brown et al., J. Immunol. 153:3079-3092, 1994.

Assays for T-cell-dependent immunoglobulin responses and
isotype switching (which will identify, among others,
proteins that modulate T-cell dependent antibody responses
and that affect Th1/Th2 profiles) include, without
limitation, those described in: Maliszewski, J. Immunol.
144:3028-3033, 1990; and Assays for B cell function: In vitro
antibody production, Mond, J.J. and Brunswick, M. In Current
Protocols in Immunology. J.E.e.a. Coligan eds. Vol 1 pp.
3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994.

Mixed lymphocyte reaction (MLR) assays (which will 
identify, among others, proteins that generate predominantly 
Th1 and CTL responses) include, without limitation, those
described in: Current Protocols in Immunology, Ed by J. E.
Coligan, A.M. Kruisbeek, D.H. Margulies, E.M. Shevach, W
Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse
Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies
in Humans); Takai et al., J. Immunol. 137:3494-3500, 1986;
Takai et al., J. Immunol. 140:508-512, 1988; Bertagnolli et
al., J. Immunol. 149:3778-3783, 1992.

Dendritic cell-dependent assays (which will identify,
among others, proteins expressed by dendritic cells that
activate naive T-cells) include, without limitation, those
described in: Guery et al., J. Immunol. 134:536-544, 1995;
Inaba et al., Journal of Experimental Medicine 173:549-559,

182:255-260, 1995; Nair et al., Journal of Virology
67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994;
Macatonia et al., Journal of Experimental Medicine
169:1255-1264, 1989; Bhardwaj et al., Journal of Clinical
Investigation 94:797-807, 1994; and Inaba et al., Journal of
Experimental Medicine 172:631-640, 1990.

Assays for lymphocyte survival/apoptosis (which will
identify, among others, proteins that prevent apoptosis after
superantigen induction and proteins that regulate lymphocyte
homeostasis) include, without limitation, those described in:
Darzynkiewicz et al., Cytometry 13:795-808, 1992; Gorczyca et
al., Leukemia 7:659-670, 1993; Gorczyca et al., Cancer
Research 53:1945-1951, 1993; Itoh et al., Cell 66:233-243,
1991; Zacharchuk, Journal of Immunology 145:4037-4045, 1990;
Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et al.,
International Journal of Oncology 1:639-648, 1992.

Assays for proteins that influence early steps of T-cell 
commitment and development include, without limitation, those
described in: Antica et al., Blood 84:111-117, 1994; Fine et
al., Cellular Immunology 155:111-122, 1994; Galy et al.,
Blood 85:2770-2778, 1995; Toki et al., Proc. Nat. Acad Sci.
USA 88:7548-7551, 1991.

Hematopoiesis Regulating Activity

A protein of the present invention may be useful in 
regulation of hematopoiesis and, consequently, in the 
treatment of myeloid or lymphoid cell deficiencies. Even
marginal biological activity in support of colony forming 

and proliferation of erythroid progenitor cells alone or in
combination with other cytokines, thereby indicating utility,
for example, in treating various anemias or for use in
conjunction with irradiation/chemotherapy to stimulate the
production of erythroid precursors and/or erythroid cells; in
supporting the growth and proliferation of myeloid cells such
as granulocytes and monocytes/macrophages (i.e., traditional
CSF activity) useful, for example, in conjunction with
chemotherapy to prevent or treat consequent
myelo-suppression; in supporting the growth and proliferation
of megakaryocytes and consequently of platelets thereby
allowing prevention or treatment of various platelet
disorders such as thrombocytopenia, and generally for use in
place of or complementary to platelet transfusions; and/or in
supporting the growth and proliferation of hematopoietic stem
cells which are capable of maturing to any and all of the
above-mentioned hematopoietic cells and therefore find
therapeutic utility in various stem cell disorders (such as
those usually treated with transplantation, including,
without limitation, aplastic anemia and paroxysmal nocturnal
hemoglobinuria), as well as in repopulating the stem cell
compartment post irradiation/chemotherapy, either in-vivo or
ex-vivo (i.e., in conjunction with bone marrow
transplantation or with peripheral progenitor cell
transplantation (homologous or heterologous)) as normal cells
or genetically manipulated for gene therapy.

The activity of a protein of the invention may, among 



various hematopoietic lines are cited above.

Assays for embryonic stem cell differentiation (which
will identify, among others, proteins that influence
embryonic differentiation hematopoiesis) include, without
limitation, those described in: Johansson et al., Cellular
Biology 15:141-151, 1995; Keller et al., Molecular and
Cellular Biology 13:473-486, 1993; McClanahan et al., Blood
81:2903-2915, 1993.

Assays for stem cell survival and differentiation (which 
will identify, among others, proteins that regulate 
lympho-hematopoiesis) include, without limitation, those 
described in: Methylcellulose colony forming assays, 
Freshney, M.G. In Culture of Hematopoietic Cells. R.I. 
Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New 
York, NY. 1994; Hirayama et al., Proc. Natl. Acad. Sci. USA 
89:5907-5911, 1992; Primitive hematopoietic colony forming 
cells with high proliferative potential, McNiece, I.K. and 
Briddell, R.A. In Culture of Hematopoietic Cells. R.I. 
Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New 
York, NY. 1994; Neben et al., Experimental Hematology 
22:353-359, 1994; Cobblestone area forming cell assay, 
Ploemacher, R.E. In Culture of Hematopoietic Cells. R.I. 
Freshney, et al. eds. Vol pp. 1-21, Wiley-Liss, Inc., New 
York, NY. 1994; Long term bone marrow cultures in the 
presence of stromal cells, Spooncer, E., Dexter, M. and 
Allen, T. In Culture of Hematopoietic Cells. R.I. Freshney, 
et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, NY. 

eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, NY. 1994. 

Tissue Growth Activity 

A protein of the present invention also may have utility 
in compositions used for bone, cartilage, tendon, ligament 
and/or nerve tissue growth or regeneration, as well as for 
wound healing and tissue repair and replacement, and in the 
treatment of burns, incisions and ulcers. 

A protein of the present invention, which induces 
cartilage and/or bone growth in circumstances where bone is 
not normally formed, has application in the healing of bone 
fractures and cartilage damage or defects in humans and other 
animals. Such a preparation employing a protein of the 
invention may have prophylactic use in closed as well as open 
fracture reduction and also in the improved fixation of 
artificial joints. De novo bone formation induced by an 
osteogenic agent contributes to the repair of congenital, 
trauma induced, or oncologic resection induced craniofacial 
defects, and also is useful in cosmetic plastic surgery. 

A protein of this invention may also be used in the 
treatment of periodontal disease, and in other tooth repair 
processes. Such agents may provide an environment to attract 
bone-forming cells, stimulate growth of bone-forming cells or 
induce differentiation of progenitors of bone-forming cells. 
A protein of the invention may also be useful in the 
treatment of osteoporosis or osteoarthritis, such as through 
stimulation of bone and/or cartilage repair or by blocking 
inflammation or processes of tissue destruction (collagenase 



Another category of tissue regeneration activity that may 
be attributable to the protein of the present invention is 
tendon/ligament formation. A protein of the present 
invention, which induces tendon/ligament-like tissue or other 
tissue formation in circumstances where such tissue is not 
normally formed, has application in the healing of tendon or 
ligament tears, deformities and other tendon or ligament 
defects in humans and other animals. Such a preparation 
employing a tendon/ligament-like tissue inducing protein may 
have prophylactic use in preventing damage to tendon or 
ligament tissue, as well as use in the improved fixation of 
tendon or ligament to bone or other tissues, and in repairing 
defects to tendon or ligament tissue. De novo 
tendon/ligament-like tissue formation induced by a 
composition of the present invention contributes to the 
repair of congenital, trauma induced, or other tendon or 
ligament defects of other origin, and is also useful in 
cosmetic plastic surgery for attachment or repair of tendons 
or ligaments. The compositions of the present invention may 
provide an environment to attract tendon- or ligament-forming 
cells, stimulate growth of tendon- or ligament-forming cells, 
induce differentiation of progenitors of tendon- or 
ligament-forming cells, or induce growth of tendon/ligament 
cells or progenitors ex vivo for return in vivo to effect 
tissue repair. The compositions of the invention may also be 
useful in the treatment of tendinitis, carpal tunnel syndrome 
and other tendon or ligament defects. The compositions may 



The protein of the present invention may also be useful 
for proliferation of neural cells and for regeneration of 
nerve and brain tissue, i.e. for the treatment of central and 
peripheral nervous system diseases and neuropathies, as well 
as mechanical and traumatic disorders, which involve 
degeneration, death or trauma to neural cells or nerve 
tissue. More specifically, a protein may be used in the 
treatment of diseases of the peripheral nervous system, such 
as peripheral nerve injuries, peripheral neuropathy and 
localized neuropathies, and central nervous system diseases, 
such as Alzheimer's, Parkinson's disease, Huntington's 
disease, amyotrophic lateral sclerosis, and Shy-Drager 
syndrome. Further conditions which may be treated in 
accordance with the present invention include mechanical and 
traumatic disorders, such as spinal cord disorders, head 
trauma and cerebrovascular diseases such as stroke. 
Peripheral neuropathies resulting from chemotherapy or other 
medical therapies may also be treatable using a protein of 
the invention. 

Proteins of the invention may also be useful to promote 
better or faster closure of non-healing wounds, including 
without limitation pressure ulcers, ulcers associated with 
vascular insufficiency, surgical and traumatic wounds, and 
the like. 

It is expected that a protein of the present invention 
may also exhibit activity for generation or regeneration of 
other tissues, such as organs (including, for example, 



vascular endothelium) tissue, or for promoting the growth of 
cells comprising such tissues. Part of the desired effects 
may be by inhibition or modulation of fibrotic scarring to 
allow normal tissue to regenerate. A protein of the 
invention may also exhibit angiogenic activity. 

A protein of the present invention may also be useful for 
gut protection or regeneration and treatment of lung or liver 
fibrosis, reperfusion injury in various tissues, and 
conditions resulting from systemic cytokine damage. 

A protein of the present invention may also be useful for 
promoting or inhibiting differentiation of tissues described 
above from precursor tissues or cells; or for inhibiting the 
growth of tissues described above. 

The activity of a protein of the invention may, among 
other means, be measured by the following methods: 

Assays for tissue generation activity include, without 
limitation, those described in: International Patent 
Publication No. WO95/16035 (bone, cartilage, tendon); 
International Patent Publication No. WO95/05846 (nerve, 
neuronal); International Patent Publication No. WO91/07491 
(skin, endothelium). 

Assays for wound healing activity include, without 
limitation, those described in: Winter, Epidermal Wound 
Healing, pps. 71-112 (Maibach, HI and Rovee, DT, eds.), Year 
Book Medical Publishers, Inc., Chicago, as modified by 
Eaglstein and Mertz, J. Invest. Dermatol 71:382-84 (1978). 

Activin/Inhibin Activity 

characterized by their ability to inhibit the release of 
follicle stimulating hormone (FSH), while activins are 
characterized by their ability to stimulate the release of 
follicle stimulating hormone (FSH). Thus, a protein of the 
present invention, alone or in heterodimers with a member of 
the inhibin α family, may be useful as a contraceptive based 
on the ability of inhibins to decrease fertility in female 
mammals and decrease spermatogenesis in male mammals. 
Administration of sufficient amounts of other inhibins can 
induce infertility in these mammals. Alternatively, the 
protein of the invention, as a homodimer or as a heterodimer 
with other protein subunits of the inhibin-β group, may be 
useful as a fertility inducing therapeutic, based upon the 
ability of activin molecules in stimulating FSH release from 
cells of the anterior pituitary. See, for example, United 
States Patent 4,798,885. A protein of the invention may also 
be useful for advancement of the onset of fertility in 
sexually immature mammals, so as to increase the lifetime 
reproductive performance of domestic animals such as cows, 
sheep and pigs. 

The activity of a protein of the invention may, among 
other means, be measured by the following methods: 

Assays for activin/inhibin activity include, without 
limitation, those described in: Vale et al., Endocrinology 
91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale 
et al., Nature 321:776-779, 1986; Mason et al., Nature 
318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci. USA 



A protein of the present invention may have chemotactic 
or chemokinetic activity (e.g., act as a chemokine) for 
mammalian cells, including, for example, monocytes, 
fibroblasts, neutrophils, T-cells, mast cells, eosinophils, 
epithelial and/or endothelial cells. Chemotactic and 
chemokinetic proteins can be used to mobilize or attract a 
desired cell population to a desired site of action. 
Chemotactic or chemokinetic proteins provide particular 
advantages in treatment of wounds and other trauma to 
tissues, as well as in treatment of localized infections. 
For example, attraction of lymphocytes, monocytes or 
neutrophils to tumors or sites of infection may result in 
improved immune responses against the tumor or infecting 
agent. 

A protein or peptide has chemotactic activity for a 
particular cell population if it can stimulate, directly or 
indirectly, the directed orientation or movement of such cell 
population. Preferably, the protein or peptide has the 
ability to directly stimulate directed movement of cells. 
Whether a particular protein has chemotactic activity for a 
population of cells can be readily determined by employing 
such protein or peptide in any known assay for cell 
chemotaxis. 

The activity of a protein of the invention may, among 
other means, be measured by the following methods: 

Assays for chemotactic activity (which will identify 
proteins that induce or prevent chemotaxis) consist of assays 



protein to induce the adhesion of one cell population to 
another cell population. Suitable assays for movement and 
adhesion include, without limitation, those described in: 
Current Protocols in Immunology, Ed by J.E. Coligan, A.M. 
Kruisbeek, D.H. Margulies, E.M. Shevach, W. Strober, Pub. 
Greene Publishing Associates and Wiley-Interscience (Chapter 
6.12, Measurement of alpha and beta Chemokines 
6.12.1-6.12.28); Taub et al. J. Clin. Invest. 95:1370-1376, 
1995; Lind et al. APMIS 103:140-146, 1995; Muller et al. Eur. 
J. Immunol. 25:1744-1748; Gruber et al. J. of Immunol. 
152:5860-5867, 1994; Johnston et al. J. of Immunol. 
153:1762-1768, 1994. 

Hemostatic and Thrombolytic Activity 

A protein of the invention may also exhibit hemostatic or 
thrombolytic activity. As a result, such a protein is 
expected to be useful in treatment of various coagulation 
disorders (including hereditary disorders, such as 
hemophilias) or to enhance coagulation and other hemostatic 
events in treating wounds resulting from trauma, surgery or 
other causes. A protein of the invention may also be useful 
for dissolving or inhibiting formation of thromboses and for 
treatment and prevention of conditions resulting therefrom 
(such as, for example, infarction of cardiac and central 
nervous system vessels (e.g., stroke)). 

The activity of a protein of the invention may, among 
other means, be measured by the following methods: 

Assays for hemostatic and thrombolytic activity include, 



Res. 45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 
(1991); Schaub, Prostaglandins 35:467-474, 1988. 

Receptor/Ligand Activity 

A protein of the present invention may also demonstrate 
activity as receptors, receptor ligands or inhibitors or 
agonists of receptor/ligand interactions. Examples of such 
receptors and ligands include, without limitation, cytokine 
receptors and their ligands, receptor kinases and their 
ligands, receptor phosphatases and their ligands, receptors 
involved in cell-cell interactions and their ligands 
(including without limitation, cellular adhesion molecules 
(such as selectins, integrins and their ligands) and 
receptor/ligand pairs involved in antigen presentation, 
antigen recognition and development of cellular and humoral 
immune responses). Receptors and ligands are also useful for 
screening of potential peptide or small molecule inhibitors 
of the relevant receptor/ligand interaction. A protein of 
the present invention (including, without limitation, 
fragments of receptors and ligands) may itself be useful 
as an inhibitor of receptor/ligand interactions. 

The activity of a protein of the invention may, among 
other means, be measured by the following methods: 

Suitable assays for receptor-ligand activity include 
without limitation those described in: Current Protocols in 
Immunology, Ed by J.E. Coligan, A.M. Kruisbeek, D.H. 
Margulies, E.M. Shevach, W. Strober, Pub. Greene Publishing 
Associates and Wiley-Interscience (Chapter 7.28, Measurement 



Bierer et al., J. Exp. Med. 168:1145-1156, 1988; Rosenstein 
et al., J. Exp. Med. 169:149-160, 1989; Stoltenborg et 
al., J. Immunol. Methods 175:59-68, 1994; Stitt et al., Cell 
80:661-670, 1995. 

Anti-Inflammatory Activity 

Proteins of the present invention may also exhibit 
anti-inflammatory activity. The anti-inflammatory activity 
may be achieved by providing a stimulus to cells involved in 
the inflammatory response, by inhibiting or promoting 
cell-cell interactions (such as, for example, cell adhesion), 
by inhibiting or promoting chemotaxis of cells involved in 
the inflammatory process, inhibiting or promoting cell 
extravasation, or by stimulating or suppressing production of 
other factors which more directly inhibit or promote an 
inflammatory response. Proteins exhibiting such activities 
can be used to treat inflammatory conditions (including 
chronic or acute conditions), including without limitation 
inflammation associated with infection (such as septic shock, 
sepsis or systemic inflammatory response syndrome (SIRS)), 
ischemia-reperfusion injury, endotoxin lethality, arthritis, 
complement-mediated hyperacute rejection, nephritis, cytokine 
or chemokine-induced lung injury, inflammatory bowel disease, 
Crohn's disease or resulting from overproduction of cytokines 
such as TNF or IL-1. Proteins of the invention may also be 
useful to treat anaphylaxis and hypersensitivity to an 
antigenic substance or material. 

Tumor Inhibition Activity 



the invention may exhibit other anti-tumor activities. A 
protein may inhibit tumor growth directly or indirectly (such 
as, for example, via ADCC). A protein may exhibit its tumor 
inhibitory activity by acting on tumor tissue or tumor 
precursor tissue, by inhibiting formation of tissues 
necessary to support tumor growth (such as, for example, by 
inhibiting angiogenesis), by causing production of other 
factors, agents or cell types which inhibit tumor growth, or 
by suppressing, eliminating or inhibiting factors, agents or 
cell types which promote tumor growth. 

Other Activities 

A protein of the invention may also exhibit one or more 
of the following additional activities or effects: inhibiting 
the growth, infection or function of, or killing, infectious 
agents, including, without limitation, bacteria, viruses, 
fungi and other parasites; affecting (suppressing or 
enhancing) bodily characteristics, including, without 
limitation, height, weight, hair color, eye color, skin, fat 
to lean ratio or other tissue pigmentation, or organ or body 
part size or shape (such as, for example, breast augmentation 
or diminution, change in bone form or shape); affecting 
biorhythms or circadian cycles or rhythms; affecting the 
fertility of male or female subjects; affecting the 
metabolism, catabolism, anabolism, processing, utilization, 
storage or elimination of dietary fat, lipid, protein, 
carbohydrate, vitamins, minerals, cofactors or other 
nutritional factors or component(s); affecting behavioral 



depression (including depressive disorders) and violent 
behaviors; providing analgesic effects or other pain reducing 
effects; promoting differentiation and growth of embryonic 
stem cells in lineages other than hematopoietic lineages; 
hormonal or endocrine activity; in the case of enzymes, 
correcting deficiencies of the enzyme and treating 
deficiency-related diseases; treatment of hyperproliferative 
disorders (such as, for example, psoriasis); 
immunoglobulin-like activity (such as, for example, the 
ability to bind antigens or complement); and the ability to 
act as an antigen in a vaccine composition to raise an immune 
response against such protein or another material or entity 
which is cross-reactive with such protein. 

SEQUENCE LISTING 

Sequence No.: 1 
Sequence length: 154 
Sequence type: Amino acid 
Topology: Linear 
Sequence kind: Protein 
Hypothetical: No 
Original source: 
Organism species: Homo sapiens 
Cell kind: Fibrosarcoma 
Cell line: HT-1080 
Clone name: HP00658 
Sequence description 

Met Lys Val Ser Ala Ala Ala Leu Ala Val Ile Leu Ile Ala Thr Ala 

1 5 10 15 

Leu Cys Ala Pro Ala Ser Ala Ser Pro Tyr Ser Ser Asp Thr Thr Pro 

20 25 30 

Cys Cys Phe Ala Tyr Ile Ala Arg Pro Leu Pro Arg Ala His Ile Lys 

35 40 45 

Glu Tyr Phe Tyr Thr Ser Gly Lys Cys Ser Asn Pro Ala Val Val His 

50 55 60 

Arg Ser Arg Met Pro Lys Arg Glu Gly Gln Gln Val Trp Gln Asp Phe 
65 70 75 80 

Leu Tyr Asp Ser Arg Leu Asn Lys Gly Lys Leu Cys His Pro Lys Glu 

85 90 95 



Gln Leu Phe Gly Asp Glu Leu Gly Trp Arg Val Leu Glu Pro Glu Leu 
115 120 125 

Thr Gln Ile Cys Leu Phe Leu Leu Ala Leu Val Leu Ala Trp Glu Ala 

130 135 140 

Ser Pro His Tyr Pro Thr Pro Pro Ala Pro 
145 150 



Sequence No.: 2 
Sequence length: 315 
Sequence type: Amino acid 
Topology: Linear 
Sequence kind: Protein 
Hypothetical: No 
Original source: 
Organism species: Homo sapiens 
Cell kind: Epidermoid carcinoma 
Cell line: KB 
Clone name: HP00714 
Sequence description 

Met Asp Leu Arg Gln Phe Leu Met Cys Leu Ser Leu Cys Thr Ala Phe 

1 5 10 15 

Ala Leu Ser Lys Pro Thr Glu Lys Lys Asp Arg Val His His Glu Pro 

20 25 30 

Gln Leu Ser Asp Lys Val His Asn Asp Ala Gln Ser Phe Asp Tyr Asp 

35 40 45 

His Asp Ala Phe Leu Gly Ala Glu Glu Ala Lys Thr Phe Asp Gln Leu 
50 55 60 



Asp Gly Asp Lys Asp Gly Phe Val Thr Val Asp Glu Leu Lys Asp Trp 
85 90 95 

Ile Lys Phe Ala Gln Lys Arg Trp Ile Tyr Glu Asp Val Glu Arg Gln 

100 105 110 

Trp Lys Gly His Asp Leu Asn Glu Asp Gly Leu Val Ser Trp Glu Glu 

115 120 125 

Tyr Lys Asn Ala Thr Tyr Gly Tyr Val Leu Asp Asp Pro Asp Pro Asp 

130 135 140 

Asp Gly Phe Asn Tyr Lys Gln Met Met Val Arg Asp Glu Arg Arg Phe 
145 150 155 160 

Lys Met Ala Asp Lys Asp Gly Asp Leu Ile Ala Thr Lys Glu Glu Phe 

165 170 175 

Thr Ala Phe Leu His Pro Glu Glu Tyr Asp Tyr Met Lys Asp Ile Val 

180 185 190 

Val Gln Glu Thr Met Glu Asp Ile Asp Lys Asn Ala Asp Gly Phe Ile 

195 200 205 

Asp Leu Glu Glu Tyr Ile Gly Asp Met Tyr Ser His Asp Gly Asn Thr 

210 215 220 

Asp Glu Pro Glu Trp Val Lys Thr Glu Arg Glu Gln Phe Val Glu Phe 
225 230 235 240 

Arg Asp Lys Asn Arg Asp Gly Lys Met Asp Lys Glu Glu Thr Lys Asp 

245 250 255 

Trp Ile Leu Pro Ser Asp Tyr Asp His Ala Glu Ala Glu Ala Arg His 

260 265 270 

Leu Val Tyr Glu Ser Asp Gln Asn Lys Asp Gly Lys Leu Thr Lys Glu 

275 280 285 

Glu Ile Val Asp Lys Tyr Asp Leu Phe Val Gly Ser Gln Ala Thr Asp 



305 310 315 

Sequence No.: 3 
Sequence length: 158 
Sequence type: Amino acid 
Topology: Linear 
Sequence kind: Protein 
Hypothetical: No 
Original source: 
Organism species: Homo sapiens 
Cell kind: Stomach cancer 
Clone name: HP00876 
Sequence description 

Met Ala Ser Arg Ser Met Arg Leu Leu Leu Leu Leu Ser Cys Leu Ala 

1 5 10 15 

Lys Thr Gly Val Leu Gly Asp Ile Ile Met Arg Pro Ser Cys Ala Pro 

20 25 30 

Gly Trp Phe Tyr His Lys Ser Asn Cys Tyr Gly Tyr Phe Arg Lys Leu 

35 40 45 

Arg Asn Trp Ser Asp Ala Glu Leu Glu Cys Gln Ser Tyr Gly Asn Gly 

50 55 60 

Ala His Leu Ala Ser Ile Leu Ser Leu Lys Glu Ala Ser Thr Ile Ala 
65 70 75 80 

Glu Tyr Ile Ser Gly Tyr Gln Arg Ser Gln Pro Ile Trp Ile Gly Leu 

85 90 95 

His Asp Pro Gln Lys Arg Gln Gln Trp Gln Trp Ile Asp Gly Ala Met 

100 105 110 



Cys Ala Glu Met Ser Ser Asn Asn Asn Phe Leu Thr Trp Ser Ser Asn 
130 135 140 

Glu Cys Asn Lys Arg Gln His Phe Leu Cys Lys Tyr Arg Pro 
145 150 155 



Sequence No.: 4 
Sequence length: 376 
Sequence type: Amino acid 
Topology: Linear 
Sequence kind: Protein 
Hypothetical: No 
Original source: 
Organism species: Homo sapiens 
Cell kind: Liver 
Clone name: HP01134 
Sequence description 

Met Val Trp Lys Val Ala Val Phe Leu Ser Val Ala Leu Gly Ile Gly 

1 5 10 15 

Ala Val Pro Ile Asp Asp Pro Glu Asp Gly Gly Lys His Trp Val Val 

20 25 30 

Ile Val Ala Gly Ser Asn Gly Trp Tyr Asn Tyr Arg His Gln Ala Asp 

35 40 45 

Ala Cys His Ala Tyr Gln Ile Ile His Arg Asn Gly Ile Pro Asp Glu 

50 55 60 

Gln Ile Val Val Met Met Tyr Asp Asp Ile Ala Tyr Ser Glu Asp Asn 
65 70 75 80 

Pro Thr Pro Gly Ile Val Ile Asn Arg Pro Asn Gly Thr Asp Val Tyr 



100 105 110 

Phe Leu Ala Val Leu Arg Gly Asp Ala Glu Ala Val Lys Gly Ile Gly 

115 120 125 

Ser Gly Lys Val Leu Lys Ser Gly Pro Gln Asp His Val Phe Ile Tyr 

130 135 140 

Phe Thr Asp His Gly Ser Thr Gly Ile Leu Val Phe Pro Asn Glu Asp 
145 150 155 160 

Leu His Val Lys Asp Leu Asn Glu Thr Ile His Tyr Met Tyr Lys His 

165 170 175 

Lys Met Tyr Arg Lys Met Val Phe Tyr Ile Glu Ala Cys Glu Ser Gly 

180 185 190 

Ser Met Met Asn His Leu Pro Asp Asn Ile Asn Val Tyr Ala Thr Thr 

195 200 205 

Ala Ala Asn Pro Arg Glu Ser Ser Tyr Ala Cys Tyr Tyr Asp Glu Lys 

210 215 220 

Arg Ser Thr Tyr Leu Gly Asp Trp Tyr Ser Val Asn Trp Met Glu Asp 
225 230 235 240 

Ser Asp Val Glu Asp Leu Thr Lys Glu Thr Leu His Lys Gln Tyr His 

245 250 255 

Leu Val Lys Ser His Thr Asn Thr Ser His Val Met Gln Tyr Gly Asn 

260 265 270 

Lys Thr Ile Ser Thr Met Lys Val Met Gln Phe Gln Gly Met Lys Arg 

275 280 285 

Lys Ala Ser Ser Pro Val Pro Leu Pro Pro Val Thr His Leu Asp Leu 

290 295 300 

Thr Pro Ser Pro Asp Val Pro Leu Thr Ile Met Lys Arg Lys Leu Met 
305 310 315 320 



Arg His Leu Asp Tyr Glu Tyr Ala Leu Arg His Leu Tyr Val Leu Val 
340 345 350 

Asn Leu Cys Glu Lys Pro Tyr Pro Leu His Arg Ile Lys Leu Ser Met 

355 360 365 

Asp His Val Cys Leu Gly His Tyr 
370 375 



Sequence No.: 5 
Sequence length: 173 
Sequence type: Amino acid 
Topology: Linear 
Sequence kind: Protein 
Hypothetical: No 
Original source: 
Organism species: Homo sapiens 
Cell kind: Epidermoid carcinoma 
Cell line: KB 
Clone name: HP10029 
Sequence description 

Met Ala Ala Pro Ser Gly Gly Trp Asn Gly Val Arg Ala Ser Leu Trp 

1 5 10 15 

Ala Ala Leu Leu Leu Gly Ala Val Ala Leu Arg Pro Ala Glu Ala Val 

20 25 30 

Ser Glu Pro Thr Thr Val Ala Phe Asp Val Arg Pro Gly Gly Val Val 

35 40 45 

His Ser Phe Ser His Asn Val Gly Pro Gly Asp Lys Tyr Thr Cys Met 
50 55 60 



Leu Gly Thr Ser Glu Asp His Gln His Phe Thr Cys Thr Ile Trp Arg 
85 90 95 

Pro Gln Gly Lys Ser Tyr Leu Tyr Phe Thr Gln Phe Lys Ala Glu Val 

100 105 110 

Arg Gly Ala Glu Ile Glu Tyr Ala Met Ala Tyr Ser Lys Ala Ala Phe 

115 120 125 

Glu Arg Glu Ser Asp Val Pro Leu Lys Thr Glu Glu Phe Glu Val Thr 

130 135 140 

Lys Thr Ala Val Ala His Arg Pro Gly Ala Phe Lys Ala Glu Leu Ser 
145 150 155 160 

Lys Leu Val Ile Val Ala Lys Ala Ser Arg Thr Glu Leu 

165 170 



Sequence No.: 6 
Sequence length: 73 
Sequence type: Amino acid 
Topology: Linear 
Sequence kind: Protein 
Hypothetical: No 
Original source: 
Organism species: Homo sapiens 
Cell kind: Epidermoid carcinoma 
Cell line: KB 
Clone name: HP10189 
Sequence description 

Met Gly Val Lys Leu Glu Ile Phe Arg Met Ile Ile Tyr Leu Thr Phe 
1 5 10 15 



Asp Val Ile Gln Arg Lys Arg Glu Leu Trp Pro Pro Glu Lys Leu Gln 
35 40 45 

Glu Ile Glu Glu Phe Lys Glu Arg Leu Arg Lys Arg Arg Glu Glu Lys 

50 55 60 

Leu Leu Arg Asp Ala Gln Gln Asn Ser 
65 70 



Sequence No.: 7 
Sequence length: 1172 
Sequence type: Amino acid 
Topology: Linear 
Sequence kind: Protein 
Hypothetical: No 
Original source: 
Organism species: Homo sapiens 
Cell kind: Histiocyte lymphoma 
Cell line: U937 
Clone name: HP10269 
Sequence description 

Met Arg Pro Phe Phe Leu Leu Cys Phe Ala Leu Pro Gly Leu Leu His 

1 5 10 15 

Ala Gln Gln Ala Cys Ser Arg Gly Ala Cys Tyr Pro Pro Val Gly Asp 

20 25 30 

Leu Leu Val Gly Arg Thr Arg Phe Leu Arg Ala Ser Ser Thr Cys Gly 

35 40 45 

Leu Thr Lys Pro Glu Thr Tyr Cys Thr Gln Tyr Gly Glu Trp Gln Met 
50 55 60 



Arg Val Glu Asn Val Ala Ser Ser Ser Gly Pro Met Arg Trp Trp Gln 
85 90 95 

Ser Gln Asn Asp Val Asn Pro Val Ser Leu Gln Leu Asp Leu Asp Arg 

100 105 110 

Arg Phe Gln Leu Gln Glu Val Met Met Glu Phe Gln Gly Pro Met Pro 

115 120 125 

Ala Gly Met Leu Ile Glu Arg Ser Ser Asp Phe Gly Lys Thr Trp Arg 

130 135 140 

Val Tyr Gln Tyr Leu Ala Ala Asp Cys Thr Ser Thr Phe Pro Arg Val 
145 150 155 160 

Arg Gln Gly Arg Pro Gln Ser Trp Gln Asp Val Arg Cys Gln Ser Leu 

165 170 175 

Pro Gln Arg Pro Asn Ala Arg Leu Asn Gly Gly Lys Val Gln Leu Asn 

180 185 190 

Leu Met Asp Leu Val Ser Gly Ile Pro Ala Thr Gln Ser Gln Lys Ile 

195 200 205 

Gln Glu Val Gly Glu Ile Thr Asn Leu Arg Val Asn Phe Thr Arg Leu 

210 215 220 

Ala Pro Val Pro Gln Arg Gly Tyr His Pro Pro Ser Ala Tyr Tyr Ala 
225 230 235 240 

Val Ser Gln Leu Arg Leu Gln Gly Ser Cys Phe Cys His Gly His Ala 

245 250 255 

Asp Arg Cys Ala Pro Lys Pro Gly Ala Ser Ala Gly Pro Ser Thr Ala 

260 265 270 

Val Gln Val His Asp Val Cys Val Cys Gln His Asn Thr Ala Gly Pro 

275 280 285 

Asn Cys Glu Arg Cys Ala Pro Phe Tyr Asn Asn Arg Pro Trp Arg Pro 



305 310 315 320 

His Ser Glu Thr Cys His Phe Asp Pro Ala Val Phe Ala Ala Ser Gln 

325 330 335 

Gly Ala Tyr Gly Gly Val Cys Asp Asn Cys Arg Asp His Thr Glu Gly 

340 345 350 

Lys Asn Cys Glu Arg Cys Gln Leu His Tyr Phe Arg Asn Arg Arg Pro 

355 360 365 

Gly Ala Ser Ile Gln Glu Thr Cys Ile Ser Cys Glu Cys Asp Pro Asp 

370 375 380 

Gly Ala Val Pro Gly Ala Pro Cys Asp Pro Val Thr Gly Gln Cys Val 
385 390 395 400 

Cys Lys Glu His Val Gln Gly Glu Arg Cys Asp Leu Cys Lys Pro Gly 

405 410 415 

Phe Thr Gly Leu Thr Tyr Ala Asn Pro Gln Gly Cys His Arg Cys Asp 

420 425 430 

Cys Asn Ile Leu Gly Ser Arg Arg Asp Met Pro Cys Asp Glu Glu Ser 

435 440 445 

Gly Arg Cys Leu Cys Leu Pro Asn Val Val Gly Pro Lys Cys Asp Gln 

450 455 460 

Cys Ala Pro Tyr His Trp Lys Leu Ala Ser Gly Gln Gly Cys Glu Pro 
465 470 475 480 

Cys Ala Cys Asp Pro His Asn Ser Leu Ser Pro Gln Cys Asn Gln Phe 

485 490 495 

Thr Gly Gln Cys Pro Cys Arg Glu Gly Phe Gly Gly Leu Met Cys Ser 

500 505 510 

Ala Ala Ala Ile Arg Gln Cys Pro Asp Arg Thr Tyr Gly Asp Val Ala 

515 520 525 

525 



Gly Cys Asp Lys Ala Ser Gly Arg Cys Leu Cys Arg Pro Gly Leu Thr 






545 550 555 560 

Gly Pro Arg Cys Asp Gln Cys Gln Arg Gly Tyr Cys Asn Arg Tyr Pro 

565 570 575 

Val Cys Val Ala Cys His Pro Cys Phe Gln Thr Tyr Asp Ala Asp Leu 

580 585 590 

Arg Glu Gln Ala Leu Arg Phe Gly Arg Leu Arg Asn Ala Thr Ala Ser 

595 600 605 

Leu Trp Ser Gly Pro Gly Leu Glu Asp Arg Gly Leu Ala Ser Arg Ile 

610 615 620 

Leu Asp Ala Lys Ser Lys Ile Glu Gln Ile Arg Ala Val Leu Ser Ser 
625 630 635 640 

Pro Ala Val Thr Glu Gln Glu Val Ala Gln Val Ala Ser Ala Ile Leu 

645 650 655 

Ser Leu Arg Arg Thr Leu Gln Gly Leu Gln Leu Asp Leu Pro Leu Glu 

660 665 670 

Glu Glu Thr Leu Ser Leu Pro Arg Asp Leu Glu Ser Leu Asp Arg Ser 

675 680 685 

Phe Asn Gly Leu Leu Thr Met Tyr Gln Arg Lys Arg Glu Gln Phe Glu 

690 695 700 

Lys Ile Ser Ser Ala Asp Pro Ser Gly Ala Phe Arg Met Leu Ser Thr 
705 710 715 720 

Ala Tyr Glu Gln Ser Ala Gln Ala Ala Gln Gln Val Ser Asp Ser Ser 

725 730 735 

Arg Leu Leu Asp Gln Leu Arg Asp Ser Arg Arg Glu Ala Glu Arg Leu 

740 745 750 

Val Arg Gln Ala Gly Gly Gly Gly Gly Thr Gly Ser Pro Lys Leu Val 



770 775 780 

Asn Lys Leu Cys Gly Asn Ser Arg Gln Met Ala Cys Thr Pro Ile Ser 
785 790 795 800 

Cys Pro Gly Glu Leu Cys Pro Gln Asp Asn Gly Thr Ala Cys Gly Ser 

805 810 815 

Arg Cys Arg Gly Val Leu Pro Arg Ala Gly Gly Ala Phe Leu Met Ala 

820 825 830 

Gly Gln Val Ala Glu Gln Leu Arg Gly Phe Asn Ala Gln Leu Gln Arg 

835 840 845 

Thr Arg Gln Met Ile Arg Ala Ala Glu Glu Ser Ala Ser Gln Ile Gln 

850 855 860 

Ser Ser Ala Gln Arg Leu Glu Thr Gln Val Ser Ala Ser Arg Ser Gln 
865 870 875 880 

Met Glu Glu Asp Val Arg Arg Thr Arg Leu Leu Ile Gln Gln Val Arg 

885 890 895 

Asp Phe Leu Thr Asp Pro Asp Thr Asp Ala Ala Thr Ile Gln Glu Val 

900 905 910 

Ser Glu Ala Val Leu Ala Leu Trp Leu Pro Thr Asp Ser Ala Thr Val 

915 920 925 

Leu Gln Lys Met Asn Glu Ile Gln Ala Ile Ala Ala Arg Leu Pro Asn 

930 935 940 

Val Asp Leu Val Leu Ser Gln Thr Lys Gln Asp Ile Ala Arg Ala Arg 
945 950 955 960 

Arg Leu Gln Ala Glu Ala Glu Glu Ala Arg Ser Arg Ala His Ala Val 

965 970 975 

Glu Gly Gln Val Glu Asp Val Val Gly Asn Leu Arg Gln Gly Thr Val 

980 985 990 



Arg Leu Ile Gln Asp Arg Val Ala Glu Val Gln Gln Val Leu Arg Pro 

1010 1015 1020 

Ala Glu Lys Leu Val Thr Ser Met Thr Lys Gln Leu Gly Asp Phe Trp 
1025 1030 1035 1040 

Thr Arg Met Glu Glu Leu Arg His Gln Ala Arg Gln Gln Gly Ala Glu 

1045 1050 1055 

Ala Val Gln Ala Gln Gln Leu Ala Glu Gly Ala Ser Glu Gln Ala Leu 

1060 1065 1070 

Ser Ala Gln Glu Gly Phe Glu Arg Ile Lys Gln Lys Tyr Ala Glu Leu 

1075 1080 1085 

Lys Asp Arg Leu Gly Gln Ser Ser Met Leu Gly Glu Gln Gly Ala Arg 

1090 1095 1100 

Ile Gln Ser Val Lys Thr Glu Ala Glu Glu Leu Phe Gly Glu Thr Met 
1105 1110 1115 1120 

Glu Met Met Asp Arg Met Lys Asp Met Glu Leu Glu Leu Leu Arg Gly 

1125 1130 1135 

Ser Gln Ala Ile Met Leu Arg Ser Ala Asp Leu Thr Gly Leu Glu Lys 

1140 1145 1150 

Arg Val Glu Gln Ile Arg Asp His Ile Asn Gly Arg Val Leu Tyr Tyr 

1155 1160 1165 

Ala Thr Cys Lys 
1170 



Sequence No.: 8 
Sequence length: 122 
Sequence type: Amino acid 
Topology: Linear 
Sequence kind: Protein 
Original source: 
Organism species: Homo sapiens 
Cell kind: Stomach cancer 
Clone name: HP10298 
Sequence description 

Met Gly Leu Leu Leu Leu Val Pro Leu Leu Leu Leu Pro Gly Ser Tyr 
1 5 10 15 

Gly Leu Pro Phe Tyr Asn Gly Phe Tyr Tyr Ser Asn Ser Ala Asn Asp 
20 25 30 

Gln Asn Leu Gly Asn Gly His Gly Lys Asp Leu Leu Asn Gly Val Lys 
35 40 45 

Leu Val Val Glu Thr Pro Glu Glu Thr Leu Phe Thr Arg Ile Leu Thr 
50 55 60 

Val Gly Pro Gln Ser Leu Gly Ser Glu Ala Leu Ala Ser Pro Thr Arg 
65 70 75 80 

Arg Ala Ala Cys Thr Val Phe Thr Ala Thr Ala Ser Thr Arg Thr Trp 
85 90 95 

Gly Pro Pro Leu Pro His Ser Leu Thr Gly Cys Val Phe Ile Glu Trp 
100 105 110 

Phe Val Phe Pro Cys Gly Leu Glu Pro Phe 
115 120 

Sequence No. : 9 
Sequence length : 175 
Sequence type : Amino acid 
Topology : Linear 
Sequence kind : Protein 

Original source: 
Organism species: Homo sapiens 
Cell kind: Stomach cancer 
Clone name: HP10368 
Sequence description 

Met Glu Lys Ile Pro Val Ser Ala Phe Leu Leu Leu Val Ala Leu Ser 
1 5 10 15 

Tyr Thr Leu Ala Arg Asp Thr Thr Val Lys Pro Gly Ala Lys Lys Asp 
20 25 30 

Thr Lys Asp Ser Arg Pro Lys Leu Pro Gln Thr Leu Ser Arg Gly Trp 
35 40 45 

Gly Asp Gln Leu Ile Trp Thr Gln Thr Tyr Glu Glu Ala Leu Tyr Lys 
50 55 60 

Ser Lys Thr Ser Asn Lys Pro Leu Met Ile Ile His His Leu Asp Glu 
65 70 75 80 

Cys Pro His Ser Gln Ala Leu Lys Lys Val Phe Ala Glu Asn Lys Glu 
85 90 95 

Ile Gln Lys Leu Ala Glu Gln Phe Val Leu Leu Asn Leu Val Tyr Glu 
100 105 110 

Thr Thr Asp Lys His Leu Ser Pro Asp Gly Gln Tyr Val Pro Arg Ile 
115 120 125 

Met Phe Val Asp Pro Ser Leu Thr Val Arg Ala Asp Ile Thr Gly Arg 
130 135 140 

Tyr Ser Asn Arg Leu Tyr Ala Tyr Glu Pro Ala Asp Thr Ala Leu Leu 
145 150 155 160 

Leu Asp Asn Met Lys Lys Ala Leu Lys Leu Leu Lys Thr Glu Leu 
165 170 175 



Sequence No.: 10 
Sequence length: 462 
Sequence type: Nucleic acid 
Strandedness: Double 
Topology: Linear 
Sequence kind: cDNA to mRNA 
Original source : 

Organism species : Homo sapiens 

Cell kind : Fibrosarcoma 

Cell line: HT-1080 

Clone name: HP00658 
Sequence description 

ATGAAGGTCT CCGCGGCAGC CCTCGCTGTC ATCCTCATTG CTACTGCCCT CTGCGCTCCT 60 

GCATCTGCCT CCCCATATTC CTCGGACACC ACACCCTGCT GCTTTGCCTA CATTGCCCGC 120 
CCACTGCCCC GTGCCCACAT CAAGGAGTAT TTCTACACCA GTGGCAAGTG CTCCAACCCA 180 
GCAGTCGTCC ACAGGTCAAG GATGCCAAAG AGAGAGGGAC AGCAAGTCTG GCAGGATTTC 240 
CTGTATGACT CCCGGCTGAA CAAGGGCAAG CTTTGTCACC CGAAAGAACC GCCAAGTGTG 300 
TGCCAACCCA GAGAAGAAAT GGGTTCGGGA GTACATCAAC TCTTTGGAGA TGAGCTAGGA 360 
TGGAGAGTCC TTGAACCTGA ACTTACACAA ATTTGCCTGT TTCTGCTTGC TCTTGTCCTA 420 
GCTTGGGAGG CTTCCCCTCA CTATCCTACC CCACCCGCTC CT 462 

Sequence No.: 11 
Sequence length: 945 
Sequence type: Nucleic acid 

Strandedness : Double 
Topology : Linear 
Sequence kind : cDNA to mRNA 
Original source: 



Organism species: Homo sapiens 
Cell kind: Epidermoid carcinoma 
Cell line: KB 
Clone name: HP00714 
Sequence description 

ATGGACCTGC GACAGTTTCT TATGTGCCTG TCCCTGTGCA CAGCCTTTGC CTTGAGCAAA 60 

CCCACAGAAA AGAAGGACCG TGTACATCAT GAGCCTCAGC TCAGTGACAA GGTTCACAAT 120 

GATGCTCAGA GTTTTGATTA TGACCATGAT GCCTTCTTGG GTGCTGAAGA AGCAAAGACC 180 

TTTGATCAGC TGACACCAGA AGAGAGCAAG GAAAGGCTTG GAAAGATTGT AAGTAAAATA 240 

GATGGCGACA AGGACGGGTT TGTCACTGTG GATGAGCTCA AAGACTGGAT TAAATTTGCA 300 

CAAAAGCGCT GGATTTACGA GGATGTAGAG CGACAGTGGA AGGGGCATGA CCTCAATGAG 360 

GACGGCCTCG TTTCCTGGGA GGAGTATAAA AATGCCACCT ACGGCTACGT TTTAGATGAT 420 

CCAGATCCTG ATGATGGATT TAACTATAAA CAGATGATGG TTAGAGATGA GCGGAGGTTT 480 

AAAATGGCAG ACAAGGATGG AGACCTCATT GCCACCAAGG AGGAGTTCAC AGCTTTCCTG 540 

CACCCTGAGG AGTATGACTA CATGAAAGAT ATAGTAGTAC AGGAAACAAT GGAAGATATA 600 

GATAAGAATG CTGATGGTTT CATTGATCTA GAAGAGTATA TTGGTGACAT GTACAGCCAT 660 

GATGGGAATA CTGATGAGCC AGAATGGGTA AAGACAGAGC GAGAGCAGTT TGTTGAGTTT 720 

CGGGATAAGA ACCGTGATGG GAAGATGGAC AAGGAAGAGA CCAAAGACTG GATCCTTCCC 780 

TCAGACTATG ATCATGCAGA GGCAGAAGCC AGGCACCTGG TCTATGAATC AGACCAAAAC 840 

AAGGATGGCA AGCTTACCAA GGAGGAGATC GTTGACAAGT ATGACTTATT TGTTGGCAGC 900 

CAGGCCACAG ATTTTGGGGA GGCCTTAGTA CGGCATGATG AGTTC 945 

Sequence No. : 12 

Sequence length: 474 

Sequence type : Nucleic acid 

Strandedness : Double 

Topology : Linear 

Sequence kind: cDNA to mRNA 

Original source : 

Organism species: Homo sapiens 
Cell kind: Stomach cancer 
Clone name: HP00876 
Sequence description 

ATGGCTTCCA GAAGCATGCG GCTGCTCCTA TTGCTGAGCT GCCTGGCCAA AACAGGAGTC 60 

CTGGGTGATA TCATCATGAG ACCCAGCTGT GCTCCTGGAT GGTTTTACCA CAAGTCCAAT 120 

TGCTATGGTT ACTTCAGGAA GCTGAGGAAC TGGTCTGATG CCGAGCTCGA GTGTCAGTCT 180 

TACGGAAACG GAGCCCACCT GGCATCTATC CTGAGTTTAA AGGAAGCCAG CACCATAGCA 240 

GAGTACATAA GTGGCTATCA GAGAAGCCAG CCGATATGGA TTGGCCTGCA CGACCCACAG 300 

AAGAGGCAGC AGTGGCAGTG GATTGATGGG GCCATGTATC TGTACAGATC CTGGTCTGGC 360 

AAGTCCATGG GTGGGAACAA GCACTGTGCT GAGATGAGCT CCAATAACAA CTTTTTAACT 420 

TGGAGCAGCA ACGAATGCAA CAAGCGCCAA CACTTCCTGT GCAAGTACCG ACCA 474 



Sequence No.: 13 

Sequence length: 1128 

Sequence type : Nucleic acid 

Strandedness : Double 

Topology : Linear 

Sequence kind: cDNA to mRNA 

Original source : 

Organism species: Homo sapiens 

Cell kind: Liver 

Clone name: HP01134 
Sequence description 

ATGGTTTGGA AAGTAGCTGT ATTCCTCAGT GTGGCCCTGG GCATTGGTGC CGTTCCTATA 60 
GATGATCCTG AAGATGGAGG CAAGCACTGG GTGGTGATCG TGGCAGGTTC AAATGGCTGG 120 
TATAATTATA GGCACCAGGC AGACGCGTGC CATGCCTACC AGATCATTCA CCGCAATGGG 180 
ATTCCTGACG AACAGATCGT TGTGATGATG TACGATGACA TTGCTTACTC TGAAGACAAT 240 
CCCACTCCAG GAATTGTGAT CAACAGGCCC AATGGCACAG ATGTCTATCA GGGAGTCCCG 300 



AAGGACTACA CTGGAGAGGA TGTTACCCCA CAAAATTTCC TTGCTGTGTT GAGAGGCGAT 360 
GCAGAAGCAG TGAAGGGCAT AGGATCCGGC AAAGTCCTGA AGAGTGGCCC CCAGGATCAC 420 
GTGTTCATTT ACTTCACTGA CCATGGATCT ACTGGAATAC TGGTTTTTCC CAATGAAGAT 480 
CTTCATGTAA AGGACCTGAA TGAGACCATC CATTACATGT ACAAACACAA AATGTACCGA 540 

AAGATGGTGT TCTACATTGA AGCCTGTGAG TCTGGGTCCA TGATGAACCA CCTGCCGGAT 600 

AACATCAATG TTTATGCAAC TACTGCTGCC AACCCCAGAG AGTCGTCCTA CGCCTGTTAC 660 

TATGATGAGA AGAGGTCCAC GTACCTGGGG GACTGGTACA GCGTCAACTG GATGGAAGAC 720 

TCGGACGTGG AAGATCTGAC TAAAGAGACC CTGCACAAGC AGTACCACCT GGTAAAATCG 780 

CACACCAACA CCAGCCACGT CATGCAGTAT GGAAACAAAA CAATCTCCAC CATGAAAGTG 840 

ATGCAGTTTC AGGGTATGAA ACGCAAAGCC AGTTCTCCCG TCCCCCTACC TCCAGTCACA 900 

CACCTTGACC TCACCCCCAG CCCTGATGTG CCTCTCACCA TCATGAAAAG GAAACTGATG 960 

AACACCAATG ATCTGGAGGA GTCCAGGCAG CTCACGGAGG AGATCCAGCG GCATCTGGAT 1020 

TACGAGTATG CGTTGAGACA TTTGTACGTG CTGGTCAACC TTTGTGAGAA GCCGTATCCG 1080 

CTTCACAGGA TAAAATTGTC CATGGACCAC GTGTGCCTTG GTCACTAC 1128 

Sequence No. : 14 

Sequence length : 519 

Sequence type : Nucleic acid 

Strandedness : Double 

Topology: Linear 

Sequence kind: cDNA to mRNA 

Original source : 

Organism species : Homo sapiens 

Cell kind : Epidermoid carcinoma 

Cell line: KB 

Clone name: HP10029 
Sequence description 

ATGGCGGCGC CCAGCGGAGG GTGGAACGGC GTCCGCGCGA GCTTGTGGGC CGCGCTGCTC 60 
CTAGGGGCCG TGGCGCTGAG GCCGGCGGAG GCGGTGTCCG AGCCCACGAC CGTGGCGTTT 120 

GACGTGCGGC CCGGCGGCGT CGTGCATTCC TTCTCCCATA ACGTGGGCCC GGGGGACAAA 180 
TATACGTGTA TGTTCACTTA CGCCTCTCAA GGAGGGACCA ATGAGCAATG GCAGATGAGT 240 
CTGGGGACCA GCGAAGACCA CCAGCACTTC ACCTGCACCA TCTGGAGGCC CCAGGGGAAG 300 
TCCTATCTGT ACTTCACACA GTTGAAGGCA GAGGTGCGGG GCGCTGAGAT TGAGTACGCC 360 
ATGGCCTACT CTAAAGCCGC ATTTGAAAGG GAAAGTGATG TCCCTCTGAA AACTGAGGAA 420 
TTTGAAGTGA CCAAAACAGC AGTGGCTCAC AGGCCCGGGG CATTCAAAGC TGAGCTGTCC 480 
AAGCTGGTGA TTGTGGCCAA GGCATCGCGC ACTGAGCTG 519 



Sequence No.: 15 

Sequence length: 219 

Sequence type : Nucleic acid 

Strandedness : Double 

Topology : Linear 

Sequence kind : cDNA to mRNA 

Original source: 

Organism species: Homo sapiens 

Cell kind: Epidermoid carcinoma 

Cell line: KB 

Clone name: HP10189 
Sequence description 

ATGGGGGTGA AGCTGGAGAT ATTTCGGATG ATAATCTACC TCACTTTCCC TGTGGCTATG 60 
TTCTGGGTTT CCAATCAGGC CGAGTGGTTT GAGGACGATG TCATACAGCG CAAGAGGGAG 120 
CTGTGGCCAC CTGAGAAGCT TCAAGAGATA GAGGAATTCA AAGAGAGGTT ACGGAAGCGG 180 
CGGGAGGAGA AGCTCCTTCG CGACGCCCAG CAGAACTCC 219 

Sequence No. : 16 
Sequence length : 3516 
Sequence type: Nucleic acid 
Strandedness : Double 



Topology: Linear 
Sequence kind: cDNA to mRNA 
Original source: 
Organism species: Homo sapiens 
Cell kind : Lymphoma 
Cell line: U937 
Clone name: HP10269 
Sequence description 

ATGAGACCAT TCTTCCTCTT GTGTTTTGCC CTGCCTGGCC TCCTGCATGC CCAACAAGCC 60 

TGCTCCCGTG GGGCCTGCTA TCCACCTGTT GGGGACCTGC TTGTTGGGAG GACCCGGTTT 120 

CTCCGAGCTT CATCTACCTG TGGACTGACC AAGCCTGAGA CCTACTGCAC CCAGTATGGC 180 

GAGTGGCAGA TGAAATGCTG CAAGTGTGAC TCCAGGCAGC CTCACAACTA CTACAGTCAC 240 

CGAGTAGAGA ATGTGGCTTC ATCCTCCGGC CCCATGCGCT GGTGGCAGTC CCAGAATGAT 300 

GTGAACCCTG TCTCTCTGCA GCTGGACCTG GACAGGAGAT TCCAGCTTCA AGAAGTCATG 360 

ATGGAGTTCC AGGGGCCCAT GCCTGCCGGC ATGCTGATTG AGCGCTCCTC AGACTTCGGT 420 

AAGACCTGGC GAGTGTACCA GTACCTGGCT GCCGACTGCA CCTCCACCTT CCCTCGGGTC 480 

CGCCAGGGTC GGCCTCAGAG CTGGCAGGAT GTTCGGTGCC AGTCCCTGCC TCAGAGGCCT 540 

AATGCACGCC TAAATGGGGG GAAGGTCCAA CTTAACCTTA TGGATTTAGT GTCTGGGATT 600 

CCAGCAACTC AAAGTCAAAA AATTCAAGAG GTGGGGGAGA TCACAAACTT GAGAGTCAAT 660 

TTCACCAGGC TGGCCCCTGT GCCCCAAAGG GGCTACCACC CTCCCAGCGC CTACTATGCT 720 

GTGTCCCAGC TCCGTCTGCA GGGGAGCTGC TTCTGTCACG GCCATGCTGA TCGCTGCGCA 780 

CCCAAGCCTG GGGCCTCTGC AGGCCCCTCC ACCGCTGTGC AGGTCCACGA TGTCTGTGTC 840 

TGCCAGCACA ACACTGCCGG CCCAAATTGT GAGCGCTGTG CACCCTTCTA CAACAACCGG 900 

CCCTGGAGAC CGGCGGAGGG CCAGGACGCC CATGAATGCC AAAGGTGCGA CTGCAATGGG 960 

CACTCAGAGA CATGTCACTT TGACCCCGCT GTGTTTGCCG CCAGCCAGGG GGCATATGGA 1020 

GGTGTGTGTG ACAATTGCCG GGACCACACC GAAGGCAAGA ACTGTGAGCG GTGTCAGCTG 1080 

CACTATTTCC GGAACCGGCG CCCGGGAGCT TCCATTCAGG AGACCTGCAT CTCCTGCGAG 1140 

TGTGATCCGG ATGGGGCAGT GCCAGGGGCT CCCTGTGACC CAGTGACCGG GCAGTGTGTG 1200 

TGCAAGGAGC ATGTGCAGGG AGAGCGCTGT GACCTATGCA AGCCGGGCTT CACTGGACTC 1260 

•AC:A7<^;r(^ ;:tc;acga<^a ■•a; ;t( ;c;r;r :: : : ■ -•;( :::"ttct- : - ; ;c ;caa^v : : •• .wT;;(;(;t;;(. 

AAATGTGACC AGTGTGCTCC CTACCACTGG AAGCTGGCCA GTGGCCAGGG CTGTGAACCG 1440 
TGTGCCTGCG ACCCGCACAA CTCCCTCAGC CCACAGTGCA ACCAGTTCAC AGGGCAGTGC 1500 

CCCTGTCGGG AAGGCTTTGG TGGCCTGATG TGCAGCGCTG CAGCCATCCG CCAGTGTCCA 1560 

GACCGGACCT ATGGAGACGT GGCCACAGGA TGCCGAGCCT GTGACTGTGA TTTCCGGGGA 1620 

ACAGAGGGCC CGGGCTGCGA CAAGGCATCA GGCCGCTGCC TCTGCCGCCC TGGCTTGACC 1680 

GGGCCCCGCT GTGACCAGTG CCAGCGAGGC TACTGCAATC GCTACCCGGT GTGCGTGGCC 1740 

TGCCACCCTT GCTTCCAGAC CTATGATGCG GACCTCCGGG AGCAGGCCCT GCGCTTTGGT 1800 

AGACTCCGCA ATGCCACCGC CAGCCTGTGG TCAGGGCCTG GGCTGGAGGA CCGTGGCCTG 1860 

GCCTCCCGGA TCCTAGATGC AAAGAGTAAG ATTGAGCAGA TCCGAGCAGT TCTCAGCAGC 1920 

CCCGCAGTCA CAGAGCAGGA GGTGGCTCAG GTGGCCAGTG CCATCCTCTC CCTCAGGCGA 1980 

AGTCTCCAGG GCCTGCAGCT GGATCTGCCC CTGGAGGAGG AGACGTTGTC CCTTCCGAGA 2040 

GACCTGGAGA GTCTTGACAG AAGCTTCAAT GGTCTCCTTA CTATGTATCA GAGGAAGAGG 2100 

GAGCAGTTTG AAAAAATAAG CAGTGCTGAT CCTTCAGGAG CCTTCCGGAT GCTGAGCACA 2160 

GCCTACGAGC AGTCAGCCCA GGCTGCTCAG CAGGTCTCCG ACAGCTCGCG CCTTTTGGAC 2220 

CAGCTCAGGG ACAGCCGGAG AGAGGCAGAG AGGCTGGTGC GGCAGGCGGG AGGAGGAGGA 2280 

GGCACCGGCA GCCCCAAGCT TGTGGCCCTG AGGCTGGAGA TGTCTTCGTT GCCTGACCTG 2340 

ACACCCACCT TCAACAAGCT CTGTGGCAAC TCCAGGCAGA TGGCTTGCAC CCCAATATCA 2400 

TGCCCTGGTG AGCTATGTCC CCAAGACAAT GGCACAGCCT GTGGCTCCCG CTGCAGGGGT 2460 

GTCCTTCCCA GGGCCGGTGG GGCCTTCTTG ATGGCGGGGC AGGTGGCTGA GCAGCTGCGG 2520 

GGCTTCAATG CCCAGCTCCA GCGGACCAGG CAGATGATTA GGGCAGCCGA GGAATCTGCC 2580 

TCACAGATTC AATCCAGTGC CCAGCGCTTG GAGACCCAGG TGAGCGCCAG CCGCTCCCAG 2640 

ATGGAGGAAG ATGTCAGACG CACACGGCTC CTAATCCAGC AGGTCCGGGA CTTCCTAACA 2700 

GACCCCGACA CTGATGCAGC CACTATCCAG GAGGTCAGCG AGGCCGTGCT GGCCCTGTGG 2760 

CTGCCCACAG ACTCAGCTAC TGTTCTGCAG AAGATGAATG AGATCCAGGC CATTGCAGCC 2820 

AGGCTCCCCA ACGTGGACTT GGTGCTGTCC CAGACCAAGC AGGACATTGC GCGTGCCCGC 2880 

CGGTTGCAGG CTGAGGCTGA GGAAGCCAGG AGCCGAGCCC ATGCAGTGGA GGGCCAGGTG 2940 

GAAGATGTGG TTGGGAACCT GCGGCAGGGG ACAGTGGCAC TGCAGGAAGC TCAGGACACC 3000 

■ tactgcgg;: ;;ac;cagaaaa ■ ;{;tggtc:aca a at^a; a a ( ,i :A(;r.Tr;f;.' ; rcACTTrvrc; 

ACACGGATGG AGGAGCTCCG CCACCAAGCC CGGCAGCAGG GGGCAGAGGC AGTCCAGGCG 3180 
CAGCAGCTTG CGGAAGGTGC CAGCGAGCAG GCATTGAGTG CCCAAGAGGG ATTTGAGAGA 3240 

ATAAAACAAA AGTATGCTGA GTTGAAGGAC CGGTTGGGTC AGAGTTCCAT GCTGGGTGAG 3300 

CAGGGTGCCC GGATCCAGAG TGTGAAGACA GAGGCAGAGG AGCTGTTTGG GGAGACCATG 3360 

GAGATGATGG ACAGGATGAA AGACATGGAG TTGGAGCTGC TGCGGGGCAG CCAGGCCATC 3420 

ATGCTGCGCT CAGCGGACCT GACAGGACTG GAGAAGCGTG TGGAGCAGAT CCGTGACCAC 3480 

ATCAATGGGC GCGTGCTCTA CTATGCCACC TGCAAG 3516 

Sequence No.: 17 

Sequence length: 366 

Sequence type: Nucleic acid 

Strandedness : Double 

Topology: Linear 

Sequence kind: cDNA to mRNA 

Original source : 

Organism species: Homo sapiens 

Cell kind : Stomach cancer 

Clone name: HP10298 
Sequence description 

ATGGGCCTGT TGCTCCTGGT CCCATTGCTC CTGCTGCCCG GCTCCTACGG ACTGCCCTTC 60 
TACAACGGCT TCTACTACTC CAACAGCGCC AACGACCAGA ACCTAGGCAA CGGTCATGGC 120 
AAAGACCTCC TTAATGGAGT GAAGCTGGTG GTGGAGACAC CCGAGGAGAC CCTGTTCACC 180 

CGCATCCTAA CTGTGGGCCC CCAGAGCCTG GGGTCCGAAG CTTTGGCTTC CCCGACCCGC 240 
AGAGCCGCTT GTACGGTGTT TACTGCTACC GCCAGCACTA GGACCTGGGG CCCTCCCCTG 300 
CCGCATTCCC TCACTGGCTG TGTATTTATT GAGTGGTTCG TTTTCCCTTG TGGGTTGGAG 360 
CCATTT 366 



Sequence No.: 18 
Sequence length: 525 
Sequence type: Nucleic acid 
Strandedness: Double 
Topology: Linear 
Sequence kind: cDNA to mRNA 
Original source : 

Organism species: Homo sapiens 

Cell kind: Stomach cancer 

Clone name: HP10368 
Sequence description 

ATGGAGAAAA TTCCAGTGTC AGCATTCTTG CTCCTTGTGG CCCTCTCCTA CACTCTGGCC 60 
AGAGATACCA CAGTCAAACC TGGAGCCAAA AAGGACACAA AGGACTCTCG ACCCAAACTG 120 
CCCCAGACCC TCTCCAGAGG TTGGGGTGAC CAACTCATCT GGACTCAGAC ATATGAAGAA 180 
GCTCTATATA AATCCAAGAC AAGCAACAAA CCCTTGATGA TTATTCATCA CTTGGATGAG 240 
TGCCCACACA GTCAAGCTTT AAAGAAAGTG TTTGCTGAAA ATAAAGAAAT CCAGAAATTG 300 
GCAGAGCAGT TTGTCCTCCT CAATCTGGTT TATGAAACAA CTGACAAACA CCTTTCTCCT 360 
GATGGCCAGT ATGTCCCCAG GATTATGTTT GTTGACCCAT CTCTGACAGT TAGAGCCGAT 420 
ATCACTGGAA GATATTCAAA CCGTCTCTAT GCTTACGAAC CTGCAGATAC AGCTCTGTTG 480 
CTTGACAACA TGAAGAAAGC TCTCAAGTTG CTGAAGACTG AATTG 525 

Sequence No. : 19 

Sequence length: 1296 

Sequence type: Nucleic acid 

Strandedness : Double 

Topology: Linear 

Sequence kind: cDNA to mRNA 

Original source: 

Organism species: Homo sapiens 

Clone name: HP00658 
Sequence characteristics: 

Code representing characteristics: CDS 

Existence site: 56.. 520 

Characterization method: E 
Sequence description 

CCTGCAGAGG ATCAAGACAG CACGTGGACC TCGCACAGCC TCTCCCACAG GTACC ATG 58 

Met 



AAG GTC TCC GCG GCA GCC CTC GCT GTC ATC CTC ATT GCT ACT GCC CTC 106 

Lys Val Ser Ala Ala Ala Leu Ala Val Ile Leu Ile Ala Thr Ala Leu 
5 10 15 

TGC GCT CCT GCA TCT GCC TCC CCA TAT TCC TCG GAC ACC ACA CCC TGC 154 
Cys Ala Pro Ala Ser Ala Ser Pro Tyr Ser Ser Asp Thr Thr Pro Cys 
20 25 30 

TGC TTT GCC TAC ATT GCC CGC CCA CTG CCC CGT GCC CAC ATC AAG GAG 202 
Cys Phe Ala Tyr Ile Ala Arg Pro Leu Pro Arg Ala His Ile Lys Glu 
35 40 45 

TAT TTC TAC ACC AGT GGC AAG TGC TCC AAC CCA GCA GTC GTC CAC AGG 250 
Tyr Phe Tyr Thr Ser Gly Lys Cys Ser Asn Pro Ala Val Val His Arg 
50 55 60 65 

TCA AGG ATG CCA AAG AGA GAG GGA CAG CAA GTC TGG CAG GAT TTC CTG 298 
Ser Arg Met Pro Lys Arg Glu Gly Gln Gln Val Trp Gln Asp Phe Leu 
70 75 80 

TAT GAC TCC CGG CTG AAC AAG GGC AAG CTT TGT CAC CCG AAA GAA CCG 346 
Tyr Asp Ser Arg Leu Asn Lys Gly Lys Leu Cys His Pro Lys Glu Pro 
85 90 95 

CCA AGT GTG TGC CAA CCC AGA GAA GAA ATG GGT TCG GGA GTA CAT CAA 394 
Pro Ser Val Cys Gln Pro Arg Glu Glu Met Gly Ser Gly Val His Gln 
100 105 110 

CTC TTT GGA GAT GAG CTA GGA TGG AGA GTC CTT GAA CCT GAA CTT ACA 442 
Leu Phe Gly Asp Glu Leu Gly Trp Arg Val Leu Glu Pro Glu Leu Thr 
115 120 125 

CAA ATT TGC CTG TTT CTG CTT GCT CTT GTC CTA GCT TGG GAG GCT TCC 490 
Gln Ile Cys Leu Phe Leu Leu Ala Leu Val Leu Ala Trp Glu Ala Ser 
130 135 140 145 

CCT CAC TAT CCT ACC CCA CCC GCT CCT TGAAGGGCCC AGA 530 
Pro His Tyr Pro Thr Pro Pro Ala Pro 
150 

TTCTACCACA CAGCAGCAGT TACAAAAACC TTCCCCAGGC TGGACGTGGT GGCTCACGCC 590 

TGTAATCCCA GCACTTTGGG AGGCCAAGGT GGGTGGATCA CTTGAGGTCA GGAGTTCGAG 650 

ACCAGCCTGG CCAACATGAT GAAACCCCAT CTCTACTAAA AATACAAAAA ATTAGCCGGG 710 

CGTGGTAGCG GGCGCCTGTA GTCCCAGCTA CTCGGGAGGC TGAGGCAGGA GAATGGCGTG 770 

AACCCGGGAG GCGGAGCTTG CAGTGAGCCG AGATCGCGCC ACTGCACTCC AGCCTGGGCG 830 

ACAGAGCGAG ACTCCGTCTC AAAAAAAAAA AAAAAAAAAA AAATACAAAA ATTAGCCGGG 890 

CGTGGTGGCC CACGCCTGTA ATCCCAGCTA CTCGGGAGGC TAAGGCAGGA AAATTGTTTG 950 

AACCCAGGAG GTGGAGGCTG CAGTGAGCTG AGATTGTGCC ACTTCACTCC AGCCTGGGTG 1010 

ACAAAGTGAG ACTCCGTCAC AACAACAACA ACAAAAAGCT TCCCCAACTA AAGCCTAGAA 1070 

GAGCTTCTGA GGCGCTGCTT TGTCAAAAGG AAGTCTCTAG GTTCTGAGCT CTGGCTTTGC 1130 

CTTGGCTTTG CCAGGGCTCT GTGACCAGGA AGGAAGTCAG CATGCCTCTA GAGGCAAGGA 1190 

GGGGAGGAAC GCTGCACTCT TAAGCTTCCG CCGTCTCAAC CCCTCACAGG AGCTTACTGG 1250 

CAAACATGAA AAATCGGCTT ACCATTAAAG TTCTCAATGC AACCAT 1296 



Sequence No . : 20 
Sequence length: 3311 
Sequence type: Nucleic acid 
Strandedness: Double 
Topology: Linear 
Sequence kind: cDNA to mRNA 
Original source: 

Organism species: Homo sapiens 
Cell kind : Epidermoid carcinoma 
Cell line: KB 
Clone name: HP00714 
Sequence characteristics : 
Code representing characteristics : CDS 
Existence site: 57. . 1004 
Characterization method: E 
Sequence description 

GAGCGGCGGC CACGGCATCC TGTGCTGTGG GGGCTACGAG GAAAGATCTA ATTATC ATG 59 

Met 



GAC CTG CGA CAG TTT CTT ATG TGC CTG TCC CTG TGC ACA GCC TTT GCC 107 
Asp Leu Arg Gln Phe Leu Met Cys Leu Ser Leu Cys Thr Ala Phe Ala 
5 10 15 

TTG AGC AAA CCC ACA GAA AAG AAG GAC CGT GTA CAT CAT GAG CCT CAG 155 
Leu Ser Lys Pro Thr Glu Lys Lys Asp Arg Val His His Glu Pro Gln 
20 25 30 

CTC AGT GAC AAG GTT CAC AAT GAT GCT CAG AGT TTT GAT TAT GAC CAT 203 
Leu Ser Asp Lys Val His Asn Asp Ala Gln Ser Phe Asp Tyr Asp His 
35 40 45 

GAT GCC TTC TTG GGT GCT GAA GAA GCA AAG ACC TTT GAT CAG CTG ACA 251 
Asp Ala Phe Leu Gly Ala Glu Glu Ala Lys Thr Phe Asp Gln Leu Thr 
50 55 60 65 

CCA GAA GAG AGC AAG GAA AGG CTT GGA AAG ATT GTA AGT AAA ATA GAT 299 
Pro Glu Glu Ser Lys Glu Arg Leu Gly Lys Ile Val Ser Lys Ile Asp 
70 75 80 

GGC GAC AAG GAC GGG TTT GTC ACT GTG GAT GAG CTC AAA GAC TGG ATT 347 
Gly Asp Lys Asp Gly Phe Val Thr Val Asp Glu Leu Lys Asp Trp Ile 
85 90 95 

AAA TTT GCA CAA AAG CGC TGG ATT TAC GAG GAT GTA GAG CGA CAG TGG 395 
Lys Phe Ala Gln Lys Arg Trp Ile Tyr Glu Asp Val Glu Arg Gln Trp 
100 105 110 

AAG GGG CAT GAC CTC AAT GAG GAC GGC CTC GTT TCC TGG GAG GAG TAT 443 
Lys Gly His Asp Leu Asn Glu Asp Gly Leu Val Ser Trp Glu Glu Tyr 
115 120 125 

AAA AAT GCC ACC TAC GGC TAC GTT TTA GAT GAT CCA GAT CCT GAT GAT 491 
Lys Asn Ala Thr Tyr Gly Tyr Val Leu Asp Asp Pro Asp Pro Asp Asp 
130 135 140 145 

GGA TTT AAC TAT AAA CAG ATG ATG GTT AGA GAT GAG CGG AGG TTT AAA 539 
Gly Phe Asn Tyr Lys Gln Met Met Val Arg Asp Glu Arg Arg Phe Lys 
150 155 160 

ATG GCA GAC AAG GAT GGA GAC CTC ATT GCC ACC AAG GAG GAG TTC ACA 587 
Met Ala Asp Lys Asp Gly Asp Leu Ile Ala Thr Lys Glu Glu Phe Thr 
165 170 175 

GCT TTC CTG CAC CCT GAG GAG TAT GAC TAC ATG AAA GAT ATA GTA GTA 635 
Ala Phe Leu His Pro Glu Glu Tyr Asp Tyr Met Lys Asp Ile Val Val 
180 185 190 

CAG GAA ACA ATG GAA GAT ATA GAT AAG AAT GCT GAT GGT TTC ATT GAT 683 
Gln Glu Thr Met Glu Asp Ile Asp Lys Asn Ala Asp Gly Phe Ile Asp 
195 200 205 

CTA GAA GAG TAT ATT GGT GAC ATG TAC AGC CAT GAT GGG AAT ACT GAT 731 
Leu Glu Glu Tyr Ile Gly Asp Met Tyr Ser His Asp Gly Asn Thr Asp 
210 215 220 225 

GAG CCA GAA TGG GTA AAG ACA GAG CGA GAG CAG TTT GTT GAG TTT CGG 779 
Glu Pro Glu Trp Val Lys Thr Glu Arg Glu Gln Phe Val Glu Phe Arg 
230 235 240 

GAT AAG AAC CGT GAT GGG AAG ATG GAC AAG GAA GAG ACC AAA GAC TGG 827 
Asp Lys Asn Arg Asp Gly Lys Met Asp Lys Glu Glu Thr Lys Asp Trp 
245 250 255 

ATC CTT CCC TCA GAC TAT GAT CAT GCA GAG GCA GAA GCC AGG CAC CTG 875 
Ile Leu Pro Ser Asp Tyr Asp His Ala Glu Ala Glu Ala Arg His Leu 
260 265 270 

GTC TAT GAA TCA GAC CAA AAC AAG GAT GGC AAG CTT ACC AAG GAG GAG 923 
Val Tyr Glu Ser Asp Gln Asn Lys Asp Gly Lys Leu Thr Lys Glu Glu 
275 280 285 

ATC GTT GAC AAG TAT GAC TTA TTT GTT GGC AGC CAG GCC ACA GAT TTT 971 
Ile Val Asp Lys Tyr Asp Leu Phe Val Gly Ser Gln Ala Thr Asp Phe 
290 295 300 305 

GGG GAG GCC TTA GTA CGG CAT GAT GAG TTC TGAGCTACGG AGGAACCCT 1020 
Gly Glu Ala Leu Val Arg His Asp Glu Phe 
310 315 
CATTTCCTCA AAAGTAATTT ATTTTTACAG CTTCTGGTTT CACATGAAAT TGTTTGCGCT 1080 
ACTGAGACTG TTACTACAAA CTTTTTAAGA CATGAAAAGG CGTAATGAAA ACCATCCCGT 1140 
CCCCATTCCT CCTCCTCTCT GAGGGACTGG AGGGAAGCCG TGCTTCTGAG GAACAACTCT 1200 
AATTAGTACA CTTGTGTTTG TAGATTTACA CTTTGTATTA TGTATTAACA TGGCGTGTTT 1260 
ATTTTTGTAT TTTTCTCTGG TTGGGAGTAT GATATGAAGG ATCAAGATCC TCAACTCACA 1320 
CATGTAGACA AACATTAGCT CTTTACTCTT TCTCAACCCC TTTTATGATT TTAATAATTC 1380 
TCACTTAACT AATTTTGTAA GCCTGAGATC AATAAGAAAT GTTCAGGAGA GAGGAAAGAA 1440 
AAAAAATATA TGCTCCACAA TTTATATTTA GAGAGAGAAC ACTTAGTCTT GCCTGTCAAA 1500 
AAGTCCAACA TTTCATAGGT AGTAGGGGCC ACATATTACA TTCAGTTGCT ATAGGTCCAG 1560 
CAACTGAACC TGCCATTACC TGGGCAAGGA AAGATCCCTT TGCTCTAGGA AAGCTTGGCC 1620 
CAAATTGATT TTCTTCTTTT TCCCCCTGTA GGACTGACTG TTGGCTAATT TTGTCAAGCA 1680 
TGCCTTTTGA AATCACTGTA AATGCCCCCA TCCGGTTCCT CTTCTTCCCA GGTGTGCCAA 1920 

GGAATTAATC TTGGTTTCAC TACAATTAAA ATTCACTCCT TTCCAATCAT GTCATTGAAA 1980 

GTGCCTTTAA CGAAAGAAAT GGTCACTGAA TGGGAATTCT CTTAAGAAAC CCTGAGATTA 2040 

AAAAAAGACT ATTTGGATAA CTTATAGGAA AGCCTAGAAC CTCCCAGTAG AGTGGGGATT 2100 

TTTTTCTTCT TCCCTTTCTC TTTTGGACAA TAGTTAAATT AGCAGTATTA GTTATGAGTT 2160 

TGGTTGCAGT GTTCTTATCT TGTGGGCTGA TTTCCAAAAA CCACATGCTG CTGAATTTAC 2220 

CAGGGATCCT CATACCTCAC AATGCAAACC ACTTACTACC AGGCCTTTTT CTGTGTCCAC 2280 

TGGAGAGCTT GAGCTCACAC TCAAAGATCA GAGGACCTAC AGAGAGGGCT CTTTGGTTTG 2340 

AGGACCATGG CTTACCTTTC CTGCCTTTGA CCCATCACAC CCCATTTCCT CCTCTTTCCC 2400 

TCTCCCCGCT GCCAAAAAAA AAAAAAAAAG GAAACGTTTA TCATGAATCA ACAGGGTTTC 2460 

AGTCCTTATC AAAGAGAGAT GTGGAAAGAG CTAAAGAAAC CACCCTTTGT TCCCAACTCC 2520 

ACTTTACCCA TATTTTATGC AACACAAACA CTGTCCTTTT GGGTCCCTTT CTTACAGATG 2580 

GACCTCTTGA GAAGAATTAT CGTATTCCAC GTTTTTAGCC CTCAGGTTAC CAAGATAAAT 2640 

ATATGTATAT ATAACCTTTA TTATTGCTAT ATCTTTGTGG ATAATACATT CAGGTGGTGC 2700 

TGGGTGATTT ATTATAATCT GAACCTAGGT ATATCCTTTG GTCTTCCACA GTCATGTTGA 2760 

GGTGGGCTCC CTGGTATGGT AAAAAGCCAG GTATAATGTA ACTTCACCCC AGCCTTTGTA 2820 

CTAAGCTCTT GATAGTGGAT ATACTCTTTT AAGTTTAGCC CCAATATAGG GTAATGGAAA 2880 

TTTCCTGCCC TCTGGGTTCC CCATTTTTAC TATTAAGAAG ACCAGTGATA ATTTAATAAT 2940 

GCCACCAACT CTGGCTTAGT TAAGTGAGAG TGTGAACTGT GTGGCAAGAG AGCCTCACAC 3000 

CTCACTAGGT GCAGAGAGCC CAGGCCTTAT GTTAAAATCA TGCACTTGAA AAGCAAACCT 3060 

TAATCTGCAA AGACAGCAGC AAGCATTATA CGGTCATCTT GAATGATCCC TTTGAAATTT 3120 

TTTTTTTGTT TGTTTGTTTA AATCAAGCCT GAGGCTGGTG AACAGTAGCT ACACACCCAT 3180 

ATTGTGTGTT CTGTGAATGC TAGCTTTCTT GAATTTGGAT ATTGGTTATT TTTTATAGAG 3240 

TGTAAACCAA GTTTTATATT CTGCAATGCG AACAGGTACC TATCTGTTTC TAAATAAAAC 3300 

TGTTTACATT C 3311 

Sequence No.: 21 
Sequence type: Nucleic acid 
Strandedness: Double 
Topology: Linear 
Sequence kind: cDNA to mRNA 
Original source: 

Organism species: Homo sapiens 
Cell kind: Stomach cancer 
Clone name: HP00876 
Sequence characteristics: 
Code representing characteristics: CDS 
Existence site: 147.. 623 
Characterization method: E 
Sequence description 

ACTGGAGACA CTGAAGAAGG CAGGGGCCCT TAGAGTCTTG GTTGCCAAAC AGATTTGCAG 60 
ATCAAGGAGA ACCCAGGAGT TTCAAAGAAG CGCTAGTAAG GTCTCTGAGA TCCTTGCACT 120 
AGCTACATCC TCAGGGTAGG AGGAAG ATG GCT TCC AGA AGC ATG CGG CTG CTC 173 
Met Ala Ser Arg Ser Met Arg Leu Leu 
1 5 

CTA TTG CTG AGC TGC CTG GCC AAA ACA GGA GTC CTG GGT GAT ATC ATC 221 
Leu Leu Leu Ser Cys Leu Ala Lys Thr Gly Val Leu Gly Asp Ile Ile 
10 15 20 25 

ATG AGA CCC AGC TGT GCT CCT GGA TGG TTT TAC CAC AAG TCC AAT TGC 269 
Met Arg Pro Ser Cys Ala Pro Gly Trp Phe Tyr His Lys Ser Asn Cys 
30 35 40 

TAT GGT TAC TTC AGG AAG CTG AGG AAC TGG TCT GAT GCC GAG CTC GAG 317 
Tyr Gly Tyr Phe Arg Lys Leu Arg Asn Trp Ser Asp Ala Glu Leu Glu 
45 50 55 

TGT CAG TCT TAC GGA AAC GGA GCC CAC CTG GCA TCT ATC CTG AGT TTA 365 
Cys Gln Ser Tyr Gly Asn Gly Ala His Leu Ala Ser Ile Leu Ser Leu 
60 65 70 

AAG GAA GCC AGC ACC ATA GCA GAG TAC ATA AGT GGC TAT CAG AGA AGC 413 
Lys Glu Ala Ser Thr Ile Ala Glu Tyr Ile Ser Gly Tyr Gln Arg Ser 
75 80 85 

CAG CCG ATA TGG ATT GGC CTG CAC GAC CCA CAG AAG AGG CAG CAG TGG 461 
Gln Pro Ile Trp Ile Gly Leu His Asp Pro Gln Lys Arg Gln Gln Trp 
90 95 100 105 

CAG TGG ATT GAT GGG GCC ATG TAT CTG TAC AGA TCC TGG TCT GGC AAG 509 
Gln Trp Ile Asp Gly Ala Met Tyr Leu Tyr Arg Ser Trp Ser Gly Lys 
110 115 120 

TCC ATG GGT GGG AAC AAG CAC TGT GCT GAG ATG AGC TCC AAT AAC AAC 557 
Ser Met Gly Gly Asn Lys His Cys Ala Glu Met Ser Ser Asn Asn Asn 
125 130 135 

TTT TTA ACT TGG AGC AGC AAC GAA TGC AAC AAG CGC CAA CAC TTC CTG 605 
Phe Leu Thr Trp Ser Ser Asn Glu Cys Asn Lys Arg Gln His Phe Leu 
140 145 150 

TGC AAG TAC CGA CCA TAGAGCAAGA ATCAAGATTC TGCTAACTCC 650 
Cys Lys Tyr Arg Pro 
155 

TGCACAGCCC CGTCCTCTTC CTTTCTGCTA GCCTGGCTAA ATCTGCTCAT TATTTCAGAG 710 
GGGAAACCTA GCAAACTAAG AGTGATAAGG GCCCTACTAC ACTGGCTTTT TTAGGCTTAG 770 
AGACAGAAAC TTTAGCATTG GCCCAGTAGT GGCTTCTAGC TCTAAATGTT TGCCCCGCCA 830 

TCCCTTTCCA CAGTATCCTT CTTCCCTCCT CCCCTGTCTC TGGCTGTCTC GAGCAGTCTA 890 
GAAGAGTGCA TCTCCAGCCT ATGAAACAGC TGGGTCTTTG GCCATAAGAA GTAAAGATTT 950 
GAAGACAGAA GGAAGAAACT CAGGAGTAAG CTTCTAGCCC CCTTCAGCTT CTACACCCTT 1010 
CTGCCCTCTC TCCATTGCCT GCACCCCACC CCAGCCACTC AACTCCTGCT TGTTTTTCCT 1070 
TTGGCCATGG GAAGGTTTAC CAGTAGAATC CTTGCTAGGT TGATGTGGGC CATACATTCC 1130 



Sequence No.: 22 
Sequence length: 1749 
Sequence type : Nucleic acid 
Strandedness : Double 
Topology : Linear 
Sequence kind: cDNA to mRNA 
Original source : 

Organism species : Homo sapiens 
Cell kind : Liver 
Clone name: HP01134 
Sequence characteristics : 
Code representing characteristics: CDS 
Existence site: 117.. 1247 
Characterization method: E 
Sequence description 

AATCACAGCA GTNCCGACGT CGTGGGTGTT TGGTGTGAGG CTGCGAGCCG CCGCCGCCAC 60 
CACTGCCACC ACGGTCGCCT GCCACAGGTG TCTGCAATTG AACTCCAAGG TGCAGA ATG 119 

Met 



GTT TGG AAA GTA GCT GTA TTC CTC AGT GTG GCC CTG GGC ATT GGT GCC 167 
Val Trp Lys Val Ala Val Phe Leu Ser Val Ala Leu Gly Ile Gly Ala 
5 10 15 

GTT CCT ATA GAT GAT CCT GAA GAT GGA GGC AAG CAC TGG GTG GTG ATC 215 
Val Pro Ile Asp Asp Pro Glu Asp Gly Gly Lys His Trp Val Val Ile 
20 25 30 

GTG GCA GGT TCA AAT GGC TGG TAT AAT TAT AGG CAC CAG GCA GAC GCG 263 
Val Ala Gly Ser Asn Gly Trp Tyr Asn Tyr Arg His Gln Ala Asp Ala 
35 40 45 

TGC CAT GCC TAC CAG ATC ATT CAC CGC AAT GGG ATT CCT GAC GAA CAG 311 
Cys His Ala Tyr Gln Ile Ile His Arg Asn Gly Ile Pro Asp Glu Gln 
50 55 60 65 

ATC GTT GTG ATG ATG TAC GAT GAC ATT GCT TAC TCT GAA GAC AAT CCC 359 
Ile Val Val Met Met Tyr Asp Asp Ile Ala Tyr Ser Glu Asp Asn Pro 
70 75 80 

ACT CCA GGA ATT GTG ATC AAC AGG CCC AAT GGC ACA GAT GTC TAT CAG 407 
Thr Pro Gly Ile Val Ile Asn Arg Pro Asn Gly Thr Asp Val Tyr Gln 
85 90 95 

GGA GTC CCG AAG GAC TAC ACT GGA GAG GAT GTT ACC CCA CAA AAT TTC 455 
Gly Val Pro Lys Asp Tyr Thr Gly Glu Asp Val Thr Pro Gln Asn Phe 
100 105 110 

CTT GCT GTG TTG AGA GGC GAT GCA GAA GCA GTG AAG GGC ATA GGA TCC 503 
Leu Ala Val Leu Arg Gly Asp Ala Glu Ala Val Lys Gly Ile Gly Ser 
115 120 125 

GGC AAA GTC CTG AAG AGT GGC CCC CAG GAT CAC GTG TTC ATT TAC TTC 551 
Gly Lys Val Leu Lys Ser Gly Pro Gln Asp His Val Phe Ile Tyr Phe 
130 135 140 145 

ACT GAC CAT GGA TCT ACT GGA ATA CTG GTT TTT CCC AAT GAA GAT CTT 599 
Thr Asp His Gly Ser Thr Gly Ile Leu Val Phe Pro Asn Glu Asp Leu 
150 155 160 

CAT GTA AAG GAC CTG AAT GAG ACC ATC CAT TAC ATG TAC AAA CAC AAA 647 
His Val Lys Asp Leu Asn Glu Thr Ile His Tyr Met Tyr Lys His Lys 
165 170 175 

ATG TAC CGA AAG ATG GTG TTC TAC ATT GAA GCC TGT GAG TCT GGG TCC 695 
Met Tyr Arg Lys Met Val Phe Tyr Ile Glu Ala Cys Glu Ser Gly Ser 
180 185 190 

ATG ATG AAC CAC CTG CCG GAT AAC ATC AAT GTT TAT GCA ACT ACT GCT 743 
Met Met Asn His Leu Pro Asp Asn Ile Asn Val Tyr Ala Thr Thr Ala 
195 200 205 

GCC AAC CCC AGA GAG TCG TCC TAC GCC TGT TAC TAT GAT GAG AAG AGG 791 
Ala Asn Pro Arg Glu Ser Ser Tyr Ala Cys Tyr Tyr Asp Glu Lys Arg 
210 215 220 225 

TCC ACG TAC CTG GGG GAC TGG TAC AGC GTC AAC TGG ATG GAA GAC TCG 839 
Ser Thr Tyr Leu Gly Asp Trp Tyr Ser Val Asn Trp Met Glu Asp Ser 
230 235 240 

GAC GTG GAA GAT CTG ACT AAA GAG ACC CTG CAC AAG CAG TAC CAC CTG 887 
Asp Val Glu Asp Leu Thr Lys Glu Thr Leu His Lys Gln Tyr His Leu 
245 250 255 

GTA AAA TCG CAC ACC AAC ACC AGC CAC GTC ATG CAG TAT GGA AAC AAA 935 
Val Lys Ser His Thr Asn Thr Ser His Val Met Gln Tyr Gly Asn Lys 
260 265 270 

ACA ATC TCC ACC ATG AAA GTG ATG CAG TTT CAG GGT ATG AAA CGC AAA 983 
Thr Ile Ser Thr Met Lys Val Met Gln Phe Gln Gly Met Lys Arg Lys 
275 280 285 

GCC AGT TCT CCC GTC CCC CTA CCT CCA GTC ACA CAC CTT GAC CTC ACC 1031 
Ala Ser Ser Pro Val Pro Leu Pro Pro Val Thr His Leu Asp Leu Thr 
290 295 300 305 

CCC AGC CCT GAT GTG CCT CTC ACC ATC ATG AAA AGG AAA CTG ATG AAC 1079 
Pro Ser Pro Asp Val Pro Leu Thr Ile Met Lys Arg Lys Leu Met Asn 
310 315 320 

ACC AAT GAT CTG GAG GAG TCC AGG CAG CTC ACG GAG GAG ATC CAG CGG 1127 
Thr Asn Asp Leu Glu Glu Ser Arg Gln Leu Thr Glu Glu Ile Gln Arg 
325 330 335 

CAT CTG GAT TAC GAG TAT GCG TTG AGA CAT TTG TAC GTG CTG GTC AAC 1175 
His Leu Asp Tyr Glu Tyr Ala Leu Arg His Leu Tyr Val Leu Val Asn 
340 345 350 

CTT TGT GAG AAG CCG TAT CCG CTT CAC AGG ATA AAA TTG TCC ATG GAC 1223 
Leu Cys Glu Lys Pro Tyr Pro Leu His Arg Ile Lys Leu Ser Met Asp 
355 360 365 

CAC GTG TGC CTT GGT CAC TAC TGAAGAGCTG CCTCCTGGAA GCTTTT 1270 
His Val Cys Leu Gly His Tyr 
370 375 

CCAAGTGTGA GCGCCCCACC GACTGTGTGC TGATCAGAGA CTGGAGAGGT GGAGTGAGAA 1330 

GTCTCCGCTG CTCGGGCCCT CCTGGGGAGC CCCCGCTCCA GGGCTCGCTC CAGGACCTTC 1390 

TTCACAAGAT GACTTGCTCG CTGTTACCTG CTTCCCCAGT CTTTTCTGAA AAACTACAAA 1450 

TTAGGGTGGG AAAAGCTCTG TATTGAGAAG GGTCATATTT GCTTTCTAGG AGGTTTGTTG 1510 

TTTTGCCTGT TAGTTTTGAG GAGCAGGAAG CTCATGGGGG CTTCTGTAGC CCCTCTCAAA 1570 

AGGAGTCTTT ATTCTGAGAA TTTGAAGCTG AAACCTCTTT AAATCTTCAG AATGATTTTA 1630 

TTGAAGAGGG CCGCAAGCCC CAAATGGAAA ACTGTTTTTA GAAAATATGA TGATTTTTGA 1690 

TTGCTTTTGT ATTTAATTCT GCAGGTGTTC AAGTCTTAAA AAATAAAGAT TTATAACAG 1749 

Sequence No . : 23 

Sequence length : 988 

Sequence type: Nucleic acid 

Strandedness : Double 

Topology : Linear 

Sequence kind: cDNA to mRNA 

Original source : 

Organism species: Homo sapiens 

Cell kind: Epidermoid carcinoma 

Cell line: KB 

Clone name: HP10029 
Sequence characteristics : 
Code representing characteristics : CDS 
Existence site: 9. . 530 



AGTCCAAC ATG GCG GCG CCC AGC GGA GGG TGG AAC GGC GTC CGC GCG AGC 
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Met Ala Ala Pro Ser Gly Gly Trp Asn Gly Val Arg Ala Ser 

15 10 

TTG TGG GCC GCG CTG CTC CTA GGG GCC GTG GCG CTG AGG CCG GCG GAG 98 

Leu Trp Ala Ala Leu Leu Leu Gly Ala Val Ala Leu Arg Pro Ala Glu 

15 20 25 30 

GCG GTG TCC GAG CCC ACG ACC GTG GCG TTT GAC GTG CGG CCC GGC GGC 146 

Ala Val Ser Glu Pro Thr Thr Val Ala Phe Asp Val Arg Pro Gly Gly 

35 40 45 

GTC GTG CAT TCC TTC TCC CAT AAC GTG GGC CCG GGG GAC AAA TAT ACG 194 

Vfll Val His Ser Phe Ser His Asn Val Gly Pro Gly Asp Lys Tyr Thr 

50 55 60 

TGT ATG TTC ACT TAG GCC TCT CAA GGA GGG ACC AAT GAG CAA TGG CAG 242 

Cys Met Phe Thr Tyr Ala Ser Gin Gly Gly Thr Asn Glu Gin Trp Gin 

65 70 75 

ATG AGT CTG GGG ACC AGC GAA GAC CAC CAG CAC TTC ACC TGC ACC ATC 290 

Met Ser Leu Gly Thr Ser Glu Asp His Gln His Phe Thr Cys Thr Ile

80 85 90 

TGG AGG CCC CAG GGG AAG TCC TAT CTG TAC TTC ACA CAG TTC AAG GCA 338 

Trp Arg Pro Gln Gly Lys Ser Tyr Leu Tyr Phe Thr Gln Phe Lys Ala

95 100 105 110 

GAG GTG CGG GGC GCT GAG ATT GAG TAC GCC ATG GCC TAC TCT AAA GCC 386 

Glu Val Arg Gly Ala Glu Ile Glu Tyr Ala Met Ala Tyr Ser Lys Ala

115 120 125 

GCA TTT GAA AGG GAA AGT GAT GTC CCT CTG AAA ACT GAG GAA TTT GAA 434

Ala Phe Glu Arg Glu Ser Asp Val Pro Leu Lys Thr Glu Glu Phe Glu 

130 135 140 



145 150 155
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CTG TCC AAG CTG GTG ATT GTG GCC AAG GCA TCG CGC ACT GAG CTG 527 
Leu Ser Lys Leu Val Ile Val Ala Lys Ala Ser Arg Thr Glu Leu

160 165 170 

TGA CCAGCAGCCC TGTTGCGGGT GGCACCTTCT CATCTCCGGT GAAGCTGAAG 580

GGGCCTGTGG CCCTGAAAGG GCCAGCACAT CACTGGTTTT CTAGGAGGGA CTCTTAAGTT 640 

TTCTACCTGG GCTGACGTTG CCTTGTCCGG AGGGGCTTGC AGGGTGGCTG AAGCCCTGGG 700 

GCAGAGAACA GAGGGTCCAG GGCCCTCCTG GCTCCCAACA GCTTCTCAGT TCCCACTTCC 760

TGCTGAGCTC TTCTGGACTC AGGATCGCAG ATCCGGGGCA CAAAGAGGGT GGGGAACATG 820 

GGGGCTATGC TGGGGAAAGC AGCCATGCTC CCCCCGACCT CCAGCCGAGC ATCCTTCATG 880 

AGCCTGCAGA ACTGCTTTCC TATGTTTACC CAGGGGACCT CCTTTCAGAT GAACTGGGAA 940 

GAGATGAAAT GTTTTTTCAT ATTTAAATAA ATAAGAACAT TAAAAAGC 988 

Sequence No.: 24 

Sequence length: 390 

Sequence type: Nucleic acid 

Strandedness: Double

Topology: Linear

Sequence kind: cDNA to mRNA

Original source:

Organism species: Homo sapiens

Cell kind: Epidermoid carcinoma

Cell line: KB

Clone name: HP10189
Sequence characteristics:
Code representing characteristics: CDS
Existence site: 102..323

AATCAGCTTC AGCAATGGAG CGTGCAAAAC ACCAGTGAGC TTCTGTCTTG CTGGAGGGTC 60
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GGCTTTGGGC GGAACTGGCT TTGTTGACCG GGAGAAACGA G ATG GGG GTG AAG CTG 116 

Met Gly Val Lys Leu 
1 5 

GAG ATA TTT CGG ATG ATA ATC TAC CTC ACT TTC CCT GTG GCT ATG TTC 164 
Glu Ile Phe Arg Met Ile Ile Tyr Leu Thr Phe Pro Val Ala Met Phe

10 15 20 

TGG GTT TCC AAT CAG GCC GAG TGG TTT GAG GAC GAT GTC ATA CAG CGC 212 
Trp Val Ser Asn Gln Ala Glu Trp Phe Glu Asp Asp Val Ile Gln Arg

25 30 35 

AAG AGG GAG CTG TGG CCA CCT GAG AAG CTT CAA GAG ATA GAG GAA TTC 260 
Lys Arg Glu Leu Trp Pro Pro Glu Lys Leu Gln Glu Ile Glu Glu Phe

40 45 50

AAA GAG AGG TTA CGG AAG CGG CGG GAG GAG AAG CTC CTT CGC GAC GCC 308 
Lys Glu Arg Leu Arg Lys Arg Arg Glu Glu Lys Leu Leu Arg Asp Ala 

55 60 65 

CAG CAG AAC TCC TGAGGCCTCC AAGTGGGAGT CCTAGCCCCT 350
Gln Gln Asn Ser
70 

CCCCTGATGA AATATACATA TACTCAGTTC CTTGTTATTC 390 
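
As an illustrative check that is not part of the original listing, the following Python sketch shows how the "Existence site" of Sequence No. 24 can be read as a coding region and translated with the standard genetic code. The names `SEQ_24`, `extract_cds` and `translate` are hypothetical, and reading positions 102 and 323 as 1-based and inclusive is an assumption, chosen because it agrees with the base counts and residue numbering printed above.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Sequence No. 24 as transcribed in the listing; grouping spaces are removed.
SEQ_24 = (
    "AATCAGCTTC AGCAATGGAG CGTGCAAAAC ACCAGTGAGC TTCTGTCTTG CTGGAGGGTC"
    "GGCTTTGGGC GGAACTGGCT TTGTTGACCG GGAGAAACGA G ATG GGG GTG AAG CTG"
    "GAG ATA TTT CGG ATG ATA ATC TAC CTC ACT TTC CCT GTG GCT ATG TTC"
    "TGG GTT TCC AAT CAG GCC GAG TGG TTT GAG GAC GAT GTC ATA CAG CGC"
    "AAG AGG GAG CTG TGG CCA CCT GAG AAG CTT CAA GAG ATA GAG GAA TTC"
    "AAA GAG AGG TTA CGG AAG CGG CGG GAG GAG AAG CTC CTT CGC GAC GCC"
    "CAG CAG AAC TCC TGAGGCCTCC AAGTGGGAGT CCTAGCCCCT"
    "CCCCTGATGA AATATACATA TACTCAGTTC CTTGTTATTC"
).replace(" ", "")


def extract_cds(seq, start, end):
    """Return the region start..end, assuming 1-based inclusive positions."""
    return seq[start - 1:end]


def translate(cds):
    """Translate codon by codon and stop at the first stop codon."""
    residues = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


cds = extract_cds(SEQ_24, 102, 323)
protein = translate(cds)
print(len(SEQ_24), len(cds), len(protein))
print(protein)
```

Under these assumptions the region is 222 bases long, that is, 73 codons followed by the TGA stop codon, which matches the 73 residues (Met 1 to Ser 73) shown for this sequence.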

Sequence No.: 25

Sequence length: 4667

Sequence type: Nucleic acid

Strandedness: Double

Topology: Linear

Sequence kind: cDNA to mRNA
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Cell kind: Lymphoma 
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Cell line: U937 
Clone name: HP10269 
Sequence characteristics: 
Code representing characteristics: CDS 
Existence site: 754..4272
Characterization method: E
Sequence description 

CATTTAGTTA CTCTGCTCAT TTCTCTTAAG CTTTCCTTGG ATGAGTTGAG CTTTGAATCC 60 
TTCCTGATGA ACCTTGCCTT TTAAGGATCC TCCAAATGCC CCAAGAAGCT GGGATTTTTC 120 
ATTTTTTTTT TCACTGGGGA GGGGAATGGT GCTTTCCAGG GTCCTGGATG TTTGAGTCTT 180 
CTCACCTTCC AGCCCGGTGA TATGTCTGGA GCTTTAACTC TCTATATAAG CCCTAATCTT 240

TGTGTTCTCT GCCTGATCTT CTGTCTGGGG TGGTCCAGGT CACAAGAAGA AGCTGACCCC 300 
TGCTGGCTTT GGGAAAATGC TGAGTTCATT GCCTGGCACA AATGCAAGGG CCCTTCCCCA 360 
CCCTGTGAAT TCTGGTCTCT GATGATCACT TACATGTGCC TTGTGCTTTC TGTTTGAGGG 420 
GCCCCTTGCA GCCCCCACAG GCAGGTGGGC ATTGTGGAGC TCACTACAAG AACTCTGGGA 480 
CCGACCGACC AACCCACTTG CCCAGTCCCG TCCTGGGAGG TGGGGGTGCA GTGACGACAG 540
ATGGGTGTGA CGGCTGCCAG ATTCCTGAGA CCCGCCCTGC GGTGGGGCTA CACCCAGCCA 600 
GGGAGTCTCC AGAGGTGAGG CTGTTGTTTA AAAACCTGGA GCCGGGAGGG GAGACCCCCA 660 
CATTCAAGAG GAGCTTTCAG GCGATCTGGA GAAAGAACGG CAGAACACAC AGCAAGGAAA 720 
GGTCCTTTCT GGGGATCACC CCATTGGCTG AAG ATG AGA CCA TTC TTC CTC TTG 774 

Met Arg Pro Phe Phe Leu Leu 
1 5 

TGT TTT GCC CTG CCT GGC CTC CTG CAT GCC CAA CAA GCC TGC TCC CGT 822 
Cys Phe Ala Leu Pro Gly Leu Leu His Ala Gln Gln Ala Cys Ser Arg

10 15 20 

GGG GCC TGC TAT CCA CCT GTT GGG GAC CTG CTT GTT GGG AGG ACC CGG 870 



TTT CTC CGA GCT TCA TCT ACC TGT GGA CTG ACC AAG CCT GAG ACC TAC 918






Phe Leu Arg Ala Ser Ser Thr Cys Gly Leu Thr Lys Pro Glu Thr Tyr 
40 45 50 55 

TGC ACC CAG TAT GGC GAG TGG CAG ATG AAA TGC TGC AAG TGT GAC TCC 966 
Cys Thr Gln Tyr Gly Glu Trp Gln Met Lys Cys Cys Lys Cys Asp Ser

60 65 70 

AGG CAG CCT CAC AAC TAC TAC AGT CAC CGA GTA GAG AAT GTG GCT TCA 1014 
Arg Gln Pro His Asn Tyr Tyr Ser His Arg Val Glu Asn Val Ala Ser

75 80 85 

TCC TCC GGC CCC ATG CGC TGG TGG CAG TCC CAG AAT GAT GTG AAC CCT 1062 
Ser Ser Gly Pro Met Arg Trp Trp Gln Ser Gln Asn Asp Val Asn Pro

90 95 100 

GTC TCT CTG CAG CTG GAC CTG GAC AGG AGA TTC CAG CTT CAA GAA GTC 1110 
Val Ser Leu Gln Leu Asp Leu Asp Arg Arg Phe Gln Leu Gln Glu Val

105 110 115 

ATG ATG GAG TTC CAG GGG CCC ATG CCT GCC GGC ATG CTG ATT GAG CGC 1158 
Met Met Glu Phe Gln Gly Pro Met Pro Ala Gly Met Leu Ile Glu Arg
120 125 130 135 

TCC TCA GAC TTC GGT AAG ACC TGG CGA GTG TAC CAG TAC CTG GCT GCC 1206 
Ser Ser Asp Phe Gly Lys Thr Trp Arg Val Tyr Gln Tyr Leu Ala Ala

140 145 150 

GAC TGC ACC TCC ACC TTC CCT CGG GTC CGC CAG GGT CGG CCT CAG AGC 1254 
Asp Cys Thr Ser Thr Phe Pro Arg Val Arg Gln Gly Arg Pro Gln Ser

155 160 165 

TGG CAG GAT GTT CGG TGC CAG TCC CTG CCT CAG AGG CCT AAT GCA CGC 1302 
Trp Gln Asp Val Arg Cys Gln Ser Leu Pro Gln Arg Pro Asn Ala Arg
170 175 180 



185 190 195
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ATT CCA GCA ACT CAA AGT CAA AAA ATT CAA GAG GTG GGG GAG ATC ACA 1398 

Ile Pro Ala Thr Gln Ser Gln Lys Ile Gln Glu Val Gly Glu Ile Thr

200 205 210 215 

AAC TTG AGA GTC AAT TTC ACC AGG CTG GCC CCT GTG CCC CAA AGG GGC 1446

Asn Leu Arg Val Asn Phe Thr Arg Leu Ala Pro Val Pro Gln Arg Gly

220 225 230 

TAC CAC CCT CCC AGC GCC TAC TAT GCT GTG TCC CAG CTC CGT CTG CAG 1494

Tyr His Pro Pro Ser Ala Tyr Tyr Ala Val Ser Gln Leu Arg Leu Gln

235 240 245 

GGG AGC TGC TTC TGT CAC GGC CAT GCT GAT CGC TGC GCA CCC AAG CCT 1542 
Gly Ser Cys Phe Cys His Gly His Ala Asp Arg Cys Ala Pro Lys Pro 

250 255 260 

GGG GCC TCT GCA GGC CCC TCC ACC GCT GTG CAG GTC CAC GAT GTC TGT 1590 
Gly Ala Ser Ala Gly Pro Ser Thr Ala Val Gln Val His Asp Val Cys

265 270 275 

GTC TGC CAG CAC AAC ACT GCC GGC CCA AAT TGT GAG CGC TGT GCA CCC 1638 
Val Cys Gln His Asn Thr Ala Gly Pro Asn Cys Glu Arg Cys Ala Pro
280 285 290 295 

TTC TAC AAC AAC CGG CCC TGG AGA CCG GCG GAG GGC CAG GAC GCC CAT 1686 
Phe Tyr Asn Asn Arg Pro Trp Arg Pro Ala Glu Gly Gln Asp Ala His

300 305 310 

GAA TGC CAA AGG TGC GAC TGC AAT GGG CAC TCA GAG ACA TGT CAC TTT 1734 
Glu Cys Gln Arg Cys Asp Cys Asn Gly His Ser Glu Thr Cys His Phe

315 320 325 

GAC CCC GCT GTG TTT GCC GCC AGC CAG GGG GCA TAT GGA GGT GTG TGT 1782 
Asp Pro Ala Val Phe Ala Ala Ser Gln Gly Ala Tyr Gly Gly Val Cys

Asp Asn Cys Arg Asp His Thr Glu Gly Lys Asn Cys Glu Arg Cys Gln
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345 350 355 

CTG CAC TAT TTC CGG AAC CGG CGC CCG GGA GCT TCC ATT CAG GAG ACC 1878 
Leu His Tyr Phe Arg Asn Arg Arg Pro Gly Ala Ser Ile Gln Glu Thr
360 365 370 375 

TGC ATC TCC TGC GAG TGT GAT CCG GAT GGG GCA GTG CCA GGG GCT CCC 1926 
Cys Ile Ser Cys Glu Cys Asp Pro Asp Gly Ala Val Pro Gly Ala Pro

380 385 390 

TGT GAC CCA GTG ACC GGG CAG TGT GTG TGC AAG GAG CAT GTG CAG GGA 1974 
Cys Asp Pro Val Thr Gly Gln Cys Val Cys Lys Glu His Val Gln Gly

395 400 405 

GAG CGC TGT GAC CTA TGC AAG CCG GGC TTC ACT GGA CTC ACC TAC GCC 2022 
Glu Arg Cys Asp Leu Cys Lys Pro Gly Phe Thr Gly Leu Thr Tyr Ala 

410 415 420 

AAC CCG CAG GGC TGC CAC CGC TGT GAC TGC AAC ATC CTG GGG TCC CGG 2070 
Asn Pro Gln Gly Cys His Arg Cys Asp Cys Asn Ile Leu Gly Ser Arg

425 430 435 

AGG GAC ATG CCG TGT GAC GAG GAG AGT GGG CGC TGC CTT TGT CTG CCC 2118 
Arg Asp Met Pro Cys Asp Glu Glu Ser Gly Arg Cys Leu Cys Leu Pro
440 445 450 455 

AAC GTG GTG GGT CCC AAA TGT GAC CAG TGT GCT CCC TAC CAC TGG AAG 2166 
Asn Val Val Gly Pro Lys Cys Asp Gln Cys Ala Pro Tyr His Trp Lys

460 465 470 

CTG GCC AGT GGC CAG GGC TGT GAA CCG TGT GCC TGC GAC CCG CAC AAC 2214 
Leu Ala Ser Gly Gln Gly Cys Glu Pro Cys Ala Cys Asp Pro His Asn

475 480 485 

TCC CTC AGC CCA CAG TGC AAC CAG TTC ACA GGG CAG TGC CCC TGT CGG 2262 

GAA GGC TTT GGT GGC CTG ATG TGC AGC GCT GCA GCC ATC CGC CAG TGT 2310
Glu Gly Phe Gly Gly Leu Met Cys Ser Ala Ala Ala Ile Arg Gln Cys

505 510 515

CCA GAC CGG ACC TAT GGA GAC GTG GCC ACA GGA TGC CGA GCC TGT GAC 2358
Pro Asp Arg Thr Tyr Gly Asp Val Ala Thr Gly Cys Arg Ala Cys Asp

520 525 530 535

TGT GAT TTC CGG GGA ACA GAG GGC CCG GGC TGC GAC AAG GCA TCA GGC 2406
Cys Asp Phe Arg Gly Thr Glu Gly Pro Gly Cys Asp Lys Ala Ser Gly

540 545 550

CGC TGC CTC TGC CGC CCT GGC TTG ACC GGG CCC CGC TGT GAC CAG TGC 2454
Arg Cys Leu Cys Arg Pro Gly Leu Thr Gly Pro Arg Cys Asp Gln Cys

555 560 565

CAG CGA GGC TAC TGC AAT CGC TAC CCG GTG TGC GTG GCC TGC CAC CCT 2502
Gln Arg Gly Tyr Cys Asn Arg Tyr Pro Val Cys Val Ala Cys His Pro

570 575 580

TGC TTC CAG ACC TAT GAT GCG GAC CTC CGG GAG CAG GCC CTG CGC TTT 2550
Cys Phe Gln Thr Tyr Asp Ala Asp Leu Arg Glu Gln Ala Leu Arg Phe

585 590 595

GGT AGA CTC CGC AAT GCC ACC GCC AGC CTG TGG TCA GGG CCT GGG CTG 2598
Gly Arg Leu Arg Asn Ala Thr Ala Ser Leu Trp Ser Gly Pro Gly Leu

600 605 610 615

GAG GAC CGT GGC CTG GCC TCC CGG ATC CTA GAT GCA AAG AGT AAG ATT 2646
Glu Asp Arg Gly Leu Ala Ser Arg Ile Leu Asp Ala Lys Ser Lys Ile

620 625 630

GAG CAG ATC CGA GCA GTT CTC AGC AGC CCC GCA GTC ACA GAG CAG GAG 2694
Glu Gln Ile Arg Ala Val Leu Ser Ser Pro Ala Val Thr Glu Gln Glu

635 640 645

650 655 660

GGC CTG CAG CTG GAT CTG CCC CTG GAG GAG GAG ACG TTG TCC CTT CCG 2790
Gly Leu Gln Leu Asp Leu Pro Leu Glu Glu Glu Thr Leu Ser Leu Pro

665 670 675

AGA GAC CTG GAG AGT CTT GAC AGA AGC TTC AAT GGT CTC CTT ACT ATG 2838
Arg Asp Leu Glu Ser Leu Asp Arg Ser Phe Asn Gly Leu Leu Thr Met

680 685 690 695

TAT CAG AGG AAG AGG GAG CAG TTT GAA AAA ATA AGC AGT GCT GAT CCT 2886
Tyr Gln Arg Lys Arg Glu Gln Phe Glu Lys Ile Ser Ser Ala Asp Pro

700 705 710

TCA GGA GCC TTC CGG ATG CTG AGC ACA GCC TAC GAG CAG TCA GCC CAG 2934
Ser Gly Ala Phe Arg Met Leu Ser Thr Ala Tyr Glu Gln Ser Ala Gln

715 720 725

GCT GCT CAG CAG GTC TCC GAC AGC TCG CGC CTT TTG GAC CAG CTC AGG 2982
Ala Ala Gln Gln Val Ser Asp Ser Ser Arg Leu Leu Asp Gln Leu Arg

730 735 740

GAC AGC CGG AGA GAG GCA GAG AGG CTG GTG CGG CAG GCG GGA GGA GGA 3030
Asp Ser Arg Arg Glu Ala Glu Arg Leu Val Arg Gln Ala Gly Gly Gly

745 750 755

GGA GGC ACC GGC AGC CCC AAG CTT GTG GCC CTG AGG CTG GAG ATG TCT 3078
Gly Gly Thr Gly Ser Pro Lys Leu Val Ala Leu Arg Leu Glu Met Ser

760 765 770 775

TCG TTG CCT GAC CTG ACA CCC ACC TTC AAC AAG CTC TGT GGC AAC TCC 3126
Ser Leu Pro Asp Leu Thr Pro Thr Phe Asn Lys Leu Cys Gly Asn Ser

780 785 790

AGG CAG ATG GCT TGC ACC CCA ATA TCA TGC CCT GGT GAG CTA TGT CCC 3174
Arg Gln Met Ala Cys Thr Pro Ile Ser Cys Pro Gly Glu Leu Cys Pro


. A A. i i A . A A ■ i v i . A i . . ■ > r v . . t i i . , t\ \ - ■ • ' A 

Gln Asp Asn Gly Thr Ala Cys Gly Ser Arg Cys Arg Gly Val Leu Pro

810 815 820

AGG GCC GGT GGG GCC TTC TTG ATG GCG GGG CAG GTG GCT GAG CAG CTG 3270
Arg Ala Gly Gly Ala Phe Leu Met Ala Gly Gln Val Ala Glu Gln Leu

825 830 835

CGG GGC TTC AAT GCC CAG CTC CAG CGG ACC AGG CAG ATG ATT AGG GCA 3318
Arg Gly Phe Asn Ala Gln Leu Gln Arg Thr Arg Gln Met Ile Arg Ala

840 845 850 855

GCC GAG GAA TCT GCC TCA CAG ATT CAA TCC AGT GCC CAG CGC TTG GAG 3366
Ala Glu Glu Ser Ala Ser Gln Ile Gln Ser Ser Ala Gln Arg Leu Glu

860 865 870

ACC CAG GTG AGC GCC AGC CGC TCC CAG ATG GAG GAA GAT GTC AGA CGC 3414
Thr Gln Val Ser Ala Ser Arg Ser Gln Met Glu Glu Asp Val Arg Arg

875 880 885

ACA CGG CTC CTA ATC CAG CAG GTC CGG GAC TTC CTA ACA GAC CCC GAC 3462
Thr Arg Leu Leu Ile Gln Gln Val Arg Asp Phe Leu Thr Asp Pro Asp

890 895 900

ACT GAT GCA GCC ACT ATC CAG GAG GTC AGC GAG GCC GTG CTG GCC CTG 3510
Thr Asp Ala Ala Thr Ile Gln Glu Val Ser Glu Ala Val Leu Ala Leu

905 910 915

TGG CTG CCC ACA GAC TCA GCT ACT GTT CTG CAG AAG ATG AAT GAG ATC 3558
Trp Leu Pro Thr Asp Ser Ala Thr Val Leu Gln Lys Met Asn Glu Ile

920 925 930 935

CAG GCC ATT GCA GCC AGG CTC CCC AAC GTG GAC TTG GTG CTG TCC CAG 3606
Gln Ala Ile Ala Ala Arg Leu Pro Asn Val Asp Leu Val Leu Ser Gln

940 945 950

ACC AAG CAG GAC ATT GCG CGT GCC CGC CGG TTG CAG GCT GAG GCT GAG 3654

GAA GCC AGG AGC CGA GCC CAT GCA GTG GAG GGC CAG GTG GAA GAT GTG 3702
Glu Ala Arg Ser Arg Ala His Ala Val Glu Gly Gln Val Glu Asp Val

970 975 980

GTT GGG AAC CTG CGG CAG GGG ACA GTG GCA CTG CAG GAA GCT CAG GAC 3750 
Val Gly Asn Leu Arg Gln Gly Thr Val Ala Leu Gln Glu Ala Gln Asp

985 990 995 

ACC ATG CAA GGC ACC AGC CGC TCC CTT CGG CTT ATC CAG GAC AGG GTT 3798

Thr Met Gln Gly Thr Ser Arg Ser Leu Arg Leu Ile Gln Asp Arg Val
1000 1005 1010 1015 

GCT GAG GTT CAG CAG GTA CTG CGG CCA GCA GAA AAG CTG GTG ACA AGC 3846 
Ala Glu Val Gln Gln Val Leu Arg Pro Ala Glu Lys Leu Val Thr Ser

1020 1025 1030 

ATG ACC AAG CAG CTG GGT GAC TTC TGG ACA CGG ATG GAG GAG CTC CGC 3894 
Met Thr Lys Gln Leu Gly Asp Phe Trp Thr Arg Met Glu Glu Leu Arg

1035 1040 1045 

CAC CAA GCC CGG CAG CAG GGG GCA GAG GCA GTC CAG GCC CAG CAG CTT 3942 
His Gln Ala Arg Gln Gln Gly Ala Glu Ala Val Gln Ala Gln Gln Leu

1050 1055 1060 

GCG GAA GGT GCC AGC GAG CAG GCA TTG AGT GCC CAA GAG GGA TTT GAG 3990 
Ala Glu Gly Ala Ser Glu Gln Ala Leu Ser Ala Gln Glu Gly Phe Glu

1065 1070 1075 

AGA ATA AAA CAA AAG TAT GCT GAG TTG AAG GAC CGG TTG GGT CAG AGT 4038

Arg Ile Lys Gln Lys Tyr Ala Glu Leu Lys Asp Arg Leu Gly Gln Ser
1080 1085 1090 1095 

TCC ATG CTG GGT GAG CAG GGT GCC CGG ATC CAG AGT GTG AAG ACA GAG 4086

Ser Met Leu Gly Glu Gln Gly Ala Arg Ile Gln Ser Val Lys Thr Glu

1100 1105 1110 



Mi M t ■ - 



1115 1120 1125

GAC ATG GAG TTG GAG CTG CTG CGG GGC AGC CAG GCC ATC ATG CTG CGC 4182

Asp Met Glu Leu Glu Leu Leu Arg Gly Ser Gln Ala Ile Met Leu Arg

1130 1135 1140 

TCA GCG GAC CTG ACA GGA CTG GAG AAG CGT GTG GAG CAG ATC CGT GAC 4230 
Ser Ala Asp Leu Thr Gly Leu Glu Lys Arg Val Glu Gln Ile Arg Asp
1145 1150 1155 



CAC ATC AAT GGG CGC GTG CTC TAC TAT GCC ACC TGC AAG T 4270 
His Ile Asn Gly Arg Val Leu Tyr Tyr Ala Thr Cys Lys
1160 1165 1170 



GATGCTACAG CTTCCAGCCC GTTGCCCCAC TCATCTGCCG CCTTTGCTTT TGGTTGGGGG 4330
CAGATTGGGT TGGAATGCTT TCCATCTCCA GGAGACTTTC ATGCAGCCTA AAGTACAGCC 4390
TGGACCACCC CTGGTGTGTA GCTAGTAAGA TTACCCTGAG CTGCAGCTGA GCCTGAGCCA 4450
ATGGGACAGT TACACTTGAC AGACAAAGAT GGTGGAGATT GGCATGCCAT TGAAACTAAG 4510
AGCTCTCAAG TCAAGGAAGC TGGGCTGGGC AGTATCCCCC GCCTTTAGTT CTCCACTGGG 4570
GAGGAATCCT GGACCAAGCA CAAAAACTTA ACAAAAGTGA TGTAAAAATG AAAAGCCAAA 4630
TAAAAATCTT TGGAAAAGAG CCTGGAGGTT CAACGAG 4667



Sequence No.: 26 

Sequence length: 1086

Sequence type: Nucleic acid

Strandedness: Double

Topology: Linear

Sequence kind: cDNA to mRNA

Original source: 

Organism species: Homo sapiens 
Cell kind: Stomach cancer 



Code representing characteristics: CDS 
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Existence site: 138.. 506 
Characterization method: E 
Sequence description 

TTTAATTTCC CCGAAATCAG ACTGCTGCCT TGGACCGGGA CAGCTCGCGG CCCCCGAGAG 60 
CTCTAGCCGT CGAGGAGCTG CCTGGGGACG TTTGCCCTGG GGCCCCAGCC TGGCCCGGGT 120 
CACCCTGGCA TGAGGAG ATG GGC CTG TTG CTC CTG GTC CCA TTG CTC CTG 170 

Met Gly Leu Leu Leu Leu Val Pro Leu Leu Leu 
1 5 10
CTG CCC GGC TCC TAC GGA CTG CCC TTC TAC AAC GGC TTC TAC TAC TCC 218 
Leu Pro Gly Ser Tyr Gly Leu Pro Phe Tyr Asn Gly Phe Tyr Tyr Ser 

15 20 25 

AAC AGC GCC AAC GAC CAG AAC CTA GGC AAC GGT CAT GGC AAA GAC CTC 266 
Asn Ser Ala Asn Asp Gln Asn Leu Gly Asn Gly His Gly Lys Asp Leu

30 35 40 

CTT AAT GGA GTG AAG CTG GTG GTG GAG ACA CCC GAG GAG ACC CTG TTC 314 
Leu Asn Gly Val Lys Leu Val Val Glu Thr Pro Glu Glu Thr Leu Phe 

45 50 55 

ACC CGC ATC CTA ACT GTG GGC CCC CAG AGC CTG GGG TCC GAA GCT TTG 362 
Thr Arg Ile Leu Thr Val Gly Pro Gln Ser Leu Gly Ser Glu Ala Leu
60 65 70 75 

GCT TCC CCG ACC CGC AGA GCC GCT TGT ACG GTG TTT ACT GCT ACC GCC 410 
Ala Ser Pro Thr Arg Arg Ala Ala Cys Thr Val Phe Thr Ala Thr Ala 

80 85 90 

AGC ACT AGG ACC TGG GGC CCT CCC CTG CCG CAT TCC CTC ACT GGC TGT 458 
Ser Thr Arg Thr Trp Gly Pro Pro Leu Pro His Ser Leu Thr Gly Cys 

95 100 105 



■ f \ 



110 115 120 
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TAACTGT TTTTATACTT CTCAATTTAA ATTTTCTTTA AACATTTTTT TACTATTTTT 560 

TGTAAAGCAA ACAGAACCCA ATGCCTCCCT TTGCTCCTGG ATGCCCCACT CCAGGAATCA 620 

TGCTTGCTCC CCTGGGCCAT TTGCGGTTTT GTGGGCTTCT GGAGGGTTCC CCGCCATCCA 680 

GGCTGGTCTC CCTCCCTTAA GGAGGTTGGT GCCCAGAGTG GGCGGTGGCC TGTCTAGAAT 740

GCCGCCGGGA GTCCGGGCAT GGTGGGCACA GTTCTCCCTG CCCCTCAGCC TGGGGGAAGA 800 

AGAGGGCCTC GGGGGCCTCC GGAGCTGGGC TTTGGGCCTC TCCTGCCCAC CTCTACTTCT 860 

CTGTGAAGCC GCTGACCCCA GTCTGCCCAC TGAGGGGCTA GGGCTGGAAG CCAGTTCTAG 920 

GCTTCCAGGC GAAAGCTGAG GGAAGGAAGA AACTCCCCTC CCCGTTCCCC TTCCCCTCTC 980 

GGTTCCAAAG AATCTGTTTT GTTGTCATTT GTTTCTCCTG TTTCCCTGTG TGGGGAGGGG 1040

CCCTCAGGTG TGTGTACTTT GGACAATAAA TGGTGCTATG ACTGCC 1086 

Sequence No.: 27

Sequence length: 866

Sequence type: Nucleic acid

Strandedness: Double

Topology: Linear

Sequence kind: cDNA to mRNA

Original source:

Organism species: Homo sapiens 
Cell kind: Stomach cancer 
Clone name: HP10368 
Sequence characteristics: 

Code representing characteristics: CDS

Existence site: 73..600

Characterization method: E
Sequence description 

Met Glu Lys Ile Pro Val Ser Ala Phe Leu Leu Leu Val
1 5 10

GCC CTC TCC TAC ACT CTG GCC AGA GAT ACC ACA GTC AAA CCT GGA GCC 159
Ala Leu Ser Tyr Thr Leu Ala Arg Asp Thr Thr Val Lys Pro Gly Ala
15 20 25

AAA AAG GAC ACA AAG GAC TCT CGA CCC AAA CTG CCC CAG ACC CTC TCC 207
Lys Lys Asp Thr Lys Asp Ser Arg Pro Lys Leu Pro Gln Thr Leu Ser
30 35 40 45

AGA GGT TGG GGT GAC CAA CTC ATC TGG ACT CAG ACA TAT GAA GAA GCT 255
Arg Gly Trp Gly Asp Gln Leu Ile Trp Thr Gln Thr Tyr Glu Glu Ala
50 55 60

CTA TAT AAA TCC AAG ACA AGC AAC AAA CCC TTG ATG ATT ATT CAT CAC 303
Leu Tyr Lys Ser Lys Thr Ser Asn Lys Pro Leu Met Ile Ile His His
65 70 75

TTG GAT GAG TGC CCA CAC AGT CAA GCT TTA AAG AAA GTG TTT GCT GAA 351
Leu Asp Glu Cys Pro His Ser Gln Ala Leu Lys Lys Val Phe Ala Glu
80 85 90

AAT AAA GAA ATC CAG AAA TTG GCA GAG CAG TTT GTC CTC CTC AAT CTG 399
Asn Lys Glu Ile Gln Lys Leu Ala Glu Gln Phe Val Leu Leu Asn Leu
95 100 105

GTT TAT GAA ACA ACT GAC AAA CAC CTT TCT CCT GAT GGC CAG TAT GTC 447
Val Tyr Glu Thr Thr Asp Lys His Leu Ser Pro Asp Gly Gln Tyr Val
110 115 120 125

CCC AGG ATT ATG TTT GTT GAC CCA TCT CTG ACA GTT AGA GCC GAT ATC 495
Pro Arg Ile Met Phe Val Asp Pro Ser Leu Thr Val Arg Ala Asp Ile
130 135 140

ACT GGA AGA TAT TCA AAC CGT CTC TAT GCT TAC GAA CCT GCA GAT ACA 543


GCT CTG TTG CTT GAC AAC ATG AAG AAA GCT CTC AAG TTG CTG AAG ACT 591
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Ala Leu Leu Leu Asp Asn Met Lys Lys Ala Leu Lys Leu Leu Lys Thr 

160 165 170 

GAA TTG TAAAGAAAAA AAATCTCCAA GCCCTTCTGT CTGTCAGGCC TTG 640

Glu Leu 
175 

AGACTTGAAA CCAGAAGAAG TGTGAGAAGA CTGGCTAGTG TGGAAGCATA GTGAACACAC 700
TGATTAGGTT ATGGTTTAAT GTTACAACAA CTATTTTTTA AGAAAAACAA GTTTTAGAAA 760 
TTTGGTTTCA AGTGTACATG TGTGAAAACA ATATTGTATA CTACCATAGT GAGCCATGAT 820 
TTTCTAAAAA AAAAAATAAA TGTTTTGGGG GTGTTCTGTT TTCTCC 866 
Claims

1. Proteins containing any of the amino acid
sequences represented by Sequence No. 1 to Sequence No. 5

2. DNAs encoding any of the proteins as described in
Claim 1.

3. cDNAs containing any of the base sequences
represented by Sequence No. 10 to Sequence No. 18.

4. cDNAs described in Claim 3 which comprise any of
the base sequences represented by Sequence No. 19 to
Sequence No. 27.
EcoRI SmaI PmaCI EcoRV

GAATTCCACAGATCCCGGGTCACGTGGGATATCCCTCCTCTCCT 




Fig. 1 
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